Abstract


This study is an analysis of the distributed version of database concurrency control. It provides
concrete mathematical evidence that the distributed problem is an inherently more complex task
than the centralized one.


The notions of transaction, concurrency, history, serializability, scheduler, etc. for centralized
databases are now well-understood both from a theoretical and a practical point of view. A formal
model for the case of distributed databases is presented. The transactions are partially ordered sets
of actions, as opposed to the totally ordered straight-line programs of the centralized case. The
scheduler is also a distributed program. Three notions of performance for a scheduler are studied
and interrelated: (i) parallelism, (ii) the computational complexity of the decision problems that it
has to solve, (iii) the cost of communication between the various parts of the scheduler. In fact the
number of messages necessary and sufficient to support a given level of parallelism is equal to the
length of a combinatorial game. This game, which captures the difference between the centralized
and the distributed problem, is PSPACE-Complete. This implies that unless NP=PSPACE, a
scheduler cannot simultaneously minimize the communication cost and be computationally efficient.


The model presented can also serve as a framework for the study of distributed concurrency
control by locking. For two transactions an efficient characterization of safe distributed locking
policies is derived. The new graph-theoretic approach generalizes the geometric method used in the
centralized case.
1. Introduction


There is now considerable literature, both theoretical and applied, concerning 
the database concurrency control problem, that is, maintaining the integrity of a
database in the face of concurrent updates. Most of the theoretical work so far has
been concerned with the centralized problem, in which the database resides at one 
site, and the update requests are submitted to a single process, called the scheduler, 
which implements the concurrency control policy of the database [7,25,37]. There is 
also some interesting applied work on distributed databases [2,3,4,28,36]. It is often 
said that the concurrency control problem is much trickier and harder in the
distributed case than in the centralized case. This is evidenced by the existing
solutions, which are extremely complex and sometimes incorrect.


In this thesis we examine how the complexity of various problems, related to 
concurrency control, is affected when we attempt to solve them for distributed 
databases. The main focus is on two areas, serializability and safe locking policies,
where efficient centralized solutions exist. Our approach and results also add to the
theory of distributed computation, independently of their database context. 


1.1 The Main Goals and New Results 


Our main goal is to demonstrate the differences between the centralized and
distributed versions of natural computational problems. We examine such problems
from the area of database concurrency control, because we also wish to determine the
limits of performance of concurrency control mechanisms.


We investigate two features of distributed computation, which distinguish it
from centralized computation. The first is the uncertainty of the order of events in a
distributed environment [19]. The order of events is no longer best viewed as total, as
in the centralized case; instead it is a partial order, whose structure depends on the
number of sites of the distributed system. So our analysis will highlight differences
between total and partial orders. The second element is the need for communication
between sites, if the performance of an on-line distributed system is to match that of
an on-line centralized system. 


In order to find concrete differences we compare the computational complexity
of centralized and distributed tasks. We will use standard concepts from the theory of
computational complexity (i.e., deterministic polynomial time P, nondeterministic
polynomial time NP or its complement co-NP and polynomial space PSPACE, 
[1,11,33,34]), as well as notions from the theory of combinatorial games [5,8,29]. The 
contributions of this thesis are summarized in the next three sections. 


(I) The Model


We have developed a simple mathematical model of distributed databases, 
which captures the intricacies of distributed computation that are most pertinent to
the database domain. Some novelties of our model are: 


(1) User transactions are arbitrary partial orders of atomic steps, thus
generalizing the straight-line programs of the centralized case. The order
corresponds to both time-precedence and information flow, and it captures
the notion of "distributed time".


(2) The scheduler, the concurrency control agent of the system, is itself a 
distributed program, consisting of communicating sequential processes [15], 
one for each site. 


(3) Redundancy (the requirement that two entities stored at different sites be 
two copies of the same "virtual entity") is not treated at the syntactic level,
but is considered as part of the integrity constraints of the database. 
Redundancy was at the root of the complexities of most previous attempts to 
formalize distributed databases. 


As a consequence, there are three measures of performance in a distributed 
database (centralized theory deals with the first two): 


(a) Parallelism, measured as the set of allowable interleavings of user actions. 


(b) Complexity of the computational problems that the processes of the 
scheduler must solve. 


(c) Communication, measured as the number of message exchanges between 
the processes of the scheduler. 


A simple analysis, Theorems 1 and 2, verifies that the model is indeed a
consistent generalization of the centralized model.


(II) Schedulers and Games 


The three measures of performance of schedulers present interesting tradeoffs. 
For example, let us fix (a) (think of it as the parallelism specs of the system). By 
expending many messages, we can reduce the problem of distributed concurrency 
control to the centralized one (by broadcasting each request) and thus solve it in 
polynomial time for most reasonable specs [25]. It turns out that, based on a priori
information about transactions, we can minimize the number of messages sent, by 
executing an exponential number of computation steps (and using polynomial space; 
this is the upper bound of our main result). Finally we cannot have a scheduler 
simultaneously using the minimum number of messages and running in polynomial 
time at each site, unless NP=PSPACE (this follows from the lower bound). 


Specifically our main result states that for a certain parallelism specification
(which in fact can be fixed to be the popular serializability principle [3,17,25,31,40])
minimizing communication costs is a computational problem complete for PSPACE
[1,11,33,34]. Thus, our result appears to be concrete mathematical evidence
suggesting that distributed concurrency control is indeed an inherently more complex
task than centralized concurrency control (under quite general conditions, centralized
schedulers can be implemented in polynomial time [25]).


Our result also adds to the literature on distributed computation, independently 
of its database context. It states, loosely speaking, that one cannot tell efficiently 
whether distributed processes can cooperate successfully for performing (an
otherwise easy) on-line computational task, at fixed communication cost. It can
therefore be considered as complementing the result of Ladner for lockout properties
of "antagonistic" processes [18]. On the other hand, Yao has asked [38] whether
minimizing communication costs for some distributed combinational computation is
computationally intractable; we answer this in the case of an on-line computation.


The proofs of both our upper and lower bounds are quite intricate. For the
upper bound we need a complicated characterization (Theorem 3) of the incomplete
histories of actions (i.e., partial orders of events in the system) that can be completed
within a fixed number of messages. This upper bound holds for serializable histories,
as well as for all similar parallelism specifications that can be achieved in a
centralized manner. For the lower bound we relate distributed scheduling to a game
played on graphs (the "conflict" graph of the transactions). Intuitively one player
(Player II) is the distributed scheduler, and the other (Player I) is an adversary who
submits user requests so as to force the scheduler to use as many messages as
possible. Player I wants to prolong the game as much as possible, whereas Player II
tries to bring it to an end as soon as possible (other than that there is no winner or
loser). The rules are related in a simple way to the cycles of the graph. We prove
that this game is complete for PSPACE, and then show that our constructs can 
faithfully reflect a special kind of distributed concurrency control situation. Both 
steps involve intricate “gadget” construction (Theorem 4). 


(III) Distributed Locking


A very common way of implementing concurrency control is by locking. In this 
method each entity is equipped with a binary semaphore (its lock) and transactions 
synchronize their operation by locking and unlocking the entities that they access. 
The purpose of locks is not mutual exclusion of shared resources as in operating 
system theory. Instead they are used to enforce correct sequencing of the indivisible 
transaction steps. 


Locking policies have been extensively studied in the centralized case
[7,13,21,26,30,39,40] and applied to distributed databases [22,23,35]. Our model
provides a framework for the rigorous study of distributed locking. 


The most elegant result in the theory of centralized locking is a geometric 
method, which efficiently characterizes the safe locking policies for two transactions. 
We examine the distributed version of this problem (i.e., when the transactions are 
partial orders instead of total orders of steps). We propose an alternative
graph-theoretic approach for the centralized problem, which in addition provides an
efficient sufficient condition for the distributed problem (Theorem 5). This condition
is also necessary for transactions distributed at two sites (Theorem 6). Therefore this
is a positive result (as opposed to the negative complexity results of Chapter 4). It
also indicates how the difficulty of the problem may be affected by the number of
sites at which we distribute it.


The material is organized as follows. Section 1.2 contains a review of database 
concurrency control, in which the various notions and results in the area are briefly 
described. Chapter 2 consists of the model definition (Section 2.1) and its simple 
properties, Theorems 1 and 2 (Section 2.2). The relation of distributed scheduling 
and games is rigorously established in Chapter 3. An upper bound on the complexity 
of the distributed problem is derived in Section 3.1 (Theorem 3). The games are 
defined in Section 3.2. Chapter 4 is an analysis of the complexity of these games and 
contains the main technical result, the lower bound in Section 4.1 (Theorem 4). The 
consequences of this result on the existence of schedulers are in Section 4.2. Chapter 
5 provides a framework for the study of distributed locking (Section 5.1), and a 
characterization of safe two-transaction systems (Section 5.2), Theorem 5 for 
sufficiency and Theorem 6 for necessity. Finally, Chapter 6 contains the conclusions 
and a list of open problems and directions for further research. 


The material on the model definition (Chapter 2) and distributed locking 
(Chapter 5) represents a joint effort with Prof. C.H. Papadimitriou. Part of this work,
namely Chapters 2, 3 and 4, appears in [16].
1.2 A Review of Database Concurrency Control


A database consists of a set of named data objects called entities. The values of 
these entities must at any time be related in some ways, prescribed by the consistency 
requirements (or integrity constraints) of the database. When a user accesses or 
updates a database, he may have to violate temporarily these consistency 
requirements, in order to restore them at some later time, with the specific data 
changed. For example, in a banking system, there may be no way to transfer funds
from one account to another in a single atomic step, without temporarily violating the
integrity constraint "the sum of all balances equals the total liability of the bank".
For this reason, several steps of the interaction of the same user with the database are
grouped into a transaction. Transactions are assumed to be correct, that is, they are
guaranteed to preserve consistency when run in isolation from other transactions. 


When many transactions access and update the same database concurrently, the 
consistency of the database may fail to be restored after all transactions have 
completed. If, for example, transaction 1 consists of the two steps 


x:=x-100
x:=x+100 


and transaction 2 of the single step x:=1.15 * x, and the consistency requirement is 
simply "x=0", then executing transaction 2 between the two steps of transaction 1 
turns a consistent database into an inconsistent one. This is despite the fact that both 
transactions are individually correct, that is, each preserves database consistency when 
run alone. We must therefore find ways to prevent such undesirable interleaving, 
without excessively harming the average user delay and other measures of the 
efficiency of the system. This is the database concurrency control problem, already 
discussed extensively in the literature (see [37]).
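
The example above can be replayed mechanically. The following is only an illustrative sketch: the step encoding, the `run` helper and the use of exact fractions (to avoid floating-point noise in the factor 1.15) are assumptions made here, not part of the original text.

```python
from fractions import Fraction

# Each transaction is a list of steps; a step maps the value of x to its new value.
T1 = [lambda x: x - 100, lambda x: x + 100]      # transaction 1: two steps
T2 = [lambda x: x * Fraction(115, 100)]          # transaction 2: x := 1.15 * x

def run(schedule, x=Fraction(0)):
    """Apply the steps named by (transaction, step index) pairs, in order."""
    for steps, i in schedule:
        x = steps[i](x)
    return x

def consistent(x):
    """The consistency requirement of the example: x = 0."""
    return x == 0

# Transaction 1 to completion, then transaction 2 (a serial execution).
serial = run([(T1, 0), (T1, 1), (T2, 0)])
# Transaction 2 executed between the two steps of transaction 1.
interleaved = run([(T1, 0), (T2, 0), (T1, 1)])

print(consistent(serial), consistent(interleaved))
```

Starting from the consistent state x=0, the serial execution ends in a consistent state, while the interleaved one does not.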


In this section we present a brief (and by no means complete) review of the 
many results on concurrency control. We start by describing the elements of 
mathematical models used to study these problems in the centralized case. This 
setting will help us to present the theory of centralized database concurrency control 
(part-a). We then discuss how distributing the database affects the formulation of the 
problem and describe some of the proposed practical solutions (part-b). 


(a) The centralized case 


Intuitively a database consists of entities and a finite set of transactions. Each 
transaction is a total order on its actions, which are operations performed indivisibly. 
An action p of a transaction T is, in general, an update (i.e., a read and then a write)
of an entity x_p, based only on the values of entities updated by actions that precede
this action in the order of T.


A history, for a set of transactions T={T_1,...,T_m}, is a total order representing an
interleaving of all transaction steps. It is therefore a total order respecting all 
transaction steps. It captures the order of events at the one site, where the database is 
stored. A prefix of a history h is an initial portion of h. H is the set of all histories, 
that is, all interleavings for all sets T of transactions. 


We are interested in correct histories (i.e. histories that take the database from a 
correct initial to a correct final state). A well-known and generally accepted correct 
subset of H is that of serializable histories (SR). A serial history is one with no 
interleaving of actions of different transactions. A history is serializable iff it is 
equivalent (in the obvious schema-theoretic sense with uninterpreted function 
symbols for updates) to some serial history. Since each transaction is by itself correct,
a serializable history is obviously correct. Serializability has been widely recognized
as the right notion of correctness (e.g., [2,3,4,17,25,31,40]). In fact it is shown in [17]
that it is the most liberal notion of correctness possible, when only syntactic
information (i.e., entity names) is available.


A scheduler is an algorithm handling incoming requests. It might use a priori
information (e.g., the syntax of T) and run time information (e.g., the order of
incoming requests). The input and output of a scheduler are strings of actions in T. In
fact, one is the history of requests and the other the history of their execution. A
scheduler is said to realize a set of histories C (where C is a subset of H) if:
(i) for all inputs, the output is a sequence in C,
(ii) for all inputs in C, the scheduler grants all requests immediately upon receipt.
This captures the on-line and optimistic features of schedulers [25].


These sets C were proposed in [25] as a measure, whereby the performance of 
schedulers can be evaluated in a uniform setting. This measure expresses the class of 
all sequences of transaction steps that can be the response of the concurrency 
controller to a stream of execution requests. The richer this class, the fewer
unnecessary delays and rearrangements of steps will occur, and the greater the
parallelism supported by the system.


A second measure of performance of a scheduler is the computational complexity 
of the decision problems it must solve. 


The area of concurrency control was unified in [25] by formulating the problem 
as a relation between the two performance measures:


CC: The problem of Concurrency Control is, given a set C of correct histories, 
find a scheduler which realizes it and is computationally efficient. 


A basic theorem in [25] is that such a scheduler exists iff the prefixes of C are 
polynomial time recognizable (i.e., in P).


The obvious question in this setting is whether an efficient Serializer (i.e., a
scheduler realizing SR) exists. The answer is yes. Testing a history for serializability,
or a prefix for whether it has a serializable completion, is an easy task in the
centralized case. The algorithm is based on conflict graphs. The conflict graph G(T)
for a transaction system T is a multigraph, with a node for each transaction in T and
an edge between T_1 and T_2 labeled x, whenever T_1 and T_2 both update entity x.
The order of executions of actions in a history assigns directions to the edges of
G(T). We call this resolving the conflicts between transactions. This result is the
"folk" theorem of concurrency control [2,17,25,28,37]:


"A history h is serializable iff it resolves conflicts without creating directed cycles
in G(T). Similarly, a prefix has a serializable completion iff the already resolved
conflicts do not create a directed cycle in G(T)."
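
The test stated by the folk theorem can be sketched in a few lines. This is a minimal illustration for the update-only model described above, not the algorithm of any cited paper; the encoding of a history as a list of (transaction, entity) pairs and the function names are assumptions made here.

```python
from collections import defaultdict

def resolved_edges(history):
    """history: list of (transaction, entity) updates in execution order.
    Returns the directed edges obtained by resolving each conflict from the
    transaction that updated the entity earlier to the one that did so later."""
    edges = set()
    seen = defaultdict(list)   # entity -> transactions that already updated it
    for txn, entity in history:
        for earlier in seen[entity]:
            if earlier != txn:
                edges.add((earlier, txn))
        seen[entity].append(txn)
    return edges

def has_directed_cycle(edges):
    """Depth-first search for a directed cycle."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
    WHITE, GREY, BLACK = 0, 1, 2
    colour = defaultdict(int)  # every node starts WHITE

    def visit(u):
        colour[u] = GREY
        for v in graph[u]:
            if colour[v] == GREY or (colour[v] == WHITE and visit(v)):
                return True
        colour[u] = BLACK
        return False

    return any(colour[u] == WHITE and visit(u) for u in list(graph))

def is_serializable(history):
    """Also applies to a prefix: True iff the conflicts resolved so far are acyclic."""
    return not has_directed_cycle(resolved_edges(history))

print(is_serializable([("T1", "x"), ("T2", "x"), ("T1", "x")]))
print(is_serializable([("T1", "x"), ("T1", "y"), ("T2", "x"), ("T2", "y")]))
```

The first history is the interleaving of the earlier x:=x-100 example and is rejected; the second is serial and is accepted.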


The pioneering work in the field was [7], which also introduced concurrency 
control mechanisms such as two phase locking and predicate locks. It was followed by
many interesting contributions (e.g., [2,13,31]). A number of concurrency control
mechanisms were compared in the uniform setting of the parallelism measure C
introduced by [25], where C⊆SR. Moreover it was shown that if we distinguish
between read and write actions then deciding whether a history is serializable (i.e., in
SR) becomes NP-Complete [25].


A very common way for implementing concurrency control is locking. In this 
method each entity is equipped with a binary semaphore (its lock) and transactions 
synchronize their operation by locking and unlocking the entities that they access. In
fact, variants are possible in which locks of different kinds are defined, and certain
kinds may coexist whereas others may not (e.g., shared or read locks, intention locks
[13]). The lock-unlock steps are inserted in a transaction according to some locking
policy. A locking policy may have the property that, if all transactions are locked
according to it, then any execution respecting the locks is guaranteed to be
serializable. Such a locking policy is called safe. 


Given a transaction system T, there are certain well-known locking policies that
can be applied to it. One is the two-phase locking (2PL) policy [7]. In it we insert 
locks surrounding the accesses of all entities, in each transaction subject to the 
following rule: The last entity to be locked is locked before the first entity is 
unlocked. Thus the transaction is divided into two phases: the locking phase, during 
which locks are acquired but not released, and the unlocking phase, in which locks 
are released but not requested. In an extremely conservative interpretation of this 
policy, we could lock all entities before the first step, and unlock them after the last.
More reasonably, we could request locks for entities at the first step that they are
accessed, and release locks at the end of the transaction. In fact, it is shown in [17]
that the latter interpretation of 2PL is the best possible concurrency control, when
syntactic information is acquired in an incremental, dynamic manner. It was first
shown in [7] that 2PL is safe (though deadlock-prone).
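
As an illustration of the two-phase rule, the sketch below checks whether a single locked transaction obeys it. The encoding of a locked transaction as a list of ("lock" | "unlock" | "access", entity) pairs is an assumption made for this example, as is the additional requirement that every access be covered by a held lock and that no lock remain held at the end.

```python
def is_two_phase(steps):
    """steps: list of (op, entity) pairs in transaction order, where op is
    "lock", "unlock" or "access". Returns True iff no lock is requested after
    the first unlock, every access is made under a held lock, and all locks
    are released by the end of the transaction."""
    held = set()
    unlocking = False          # becomes True at the first unlock
    for op, entity in steps:
        if op == "lock":
            if unlocking or entity in held:
                return False   # lock requested in the unlocking phase (or twice)
            held.add(entity)
        elif op == "unlock":
            if entity not in held:
                return False
            held.remove(entity)
            unlocking = True
        elif op == "access":
            if entity not in held:
                return False   # access not surrounded by a lock
        else:
            raise ValueError(f"unknown operation: {op}")
    return not held

two_phase = [("lock", "x"), ("access", "x"), ("lock", "y"),
             ("access", "y"), ("unlock", "x"), ("unlock", "y")]
not_two_phase = [("lock", "x"), ("access", "x"), ("unlock", "x"),
                 ("lock", "y"), ("access", "y"), ("unlock", "y")]
print(is_two_phase(two_phase), is_two_phase(not_two_phase))
```

The second transaction locks y after it has already unlocked x, so it violates the rule.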


If the entities are unstructured (that is, transactions access them in all possible 
patterns) then 2PL is the best possible locking policy. Suppose, however, that the 
entities form a tree, and are accessed by transactions as follows: 

(i) A transaction accesses a subtree, whose root is the first entity to be accessed (after,
of course, it is locked).

(ii) After this, when an entity is locked, its parent must be locked and not yet
unlocked.

Then this locking policy, called the tree policy, is shown in [30] to be both safe and
deadlock-free. This also holds for the more general digraph policy of [39]. In fact, the
latter is generalized in [39] to the hypergraph policy which, it is proved, is the most
general possible safe and deadlock-free policy.
Safe locking policies were characterized in [39]. The limitations of the parallelism
that can be provided by locking were investigated in [26]. Safety of two-transaction 
locked systems can be efficiently decided [21], by employing a geometric 
methodology reminiscent of that used by Dijkstra for studying deadlocks [6]. Besides 
its independent interest and elegance, the two-transaction solution is the building 
block for resolving the general case. It turns out that a locking policy defined on d>2 
transactions is safe iff all of its two-transaction subsystems are safe, plus a 
combinatorial condition. This combinatorial condition turns out to be NP-Complete, 
but it is simple enough to have some interesting corollaries. For example, all specific 
locking policies mentioned above can be shown to be safe as immediate 
consequences of the condition.


(b) The distributed case 

The assumption that the database is stored at one site is not always true.
Distributing the database among various sites might be necessary and even desirable.
In fact the current trend in technology is towards distributed databases
[2,3,4,28,35,36].


In a distributed environment the transactions, histories and prefixes become 
partial orders and the scheduler consists of many communicating sequential 
processes, one at each site. The model presented in Chapter 2 abstracts the relevant 
properties of transactions, actions, histories, prefixes, and schedulers. It extends the 
parallelism measure of schedulers, the concept of serializability and conflict graphs to 
the distributed case. The new elements are, that the scheduler uses message passing 
between sites and that the conflicts are partitioned into the conflicts at every site. The
problem of Distributed Concurrency Control (DCC) can be formalized as was that of 
Concurrency Control (CC). A rigorous treatment of this problem will require the 
selection of a formal system, in which to express distributed algorithms e.g. [9,15,24]. 
Such a system, with the least possible restrictions, is selected in the next chapter. 


The problem of concurrency control has been examined by designers of
distributed databases and various solutions have been proposed. Because of other
important considerations in a distributed environment, concurrency control is
viewed (and rightly so) as only one of a number of goals of such systems (e.g., other
problems are optimal partitioning of the database, distributed query processing [12],
properties of the communication medium, importance of deadlocks between sites
[22,23], reliability of updates [14]). What is not clear from these involved distributed 
algorithms is, whether the distributed version of concurrency control, by itself, is a 
more complex task than its centralized version. This in fact is the subject of the 
present study. 


A survey of distributed database concurrency control algorithms is contained in 
[4]. These algorithms are classified into methods using transaction timestamps to 
resolve conflicts [19] and methods using locking (particularly the two phase locking
rule) [7]. The methods are compared on the basis of the three measures indicated in
Section 1.1 (i.e., parallelism, complexity, communication), with an additional
distinction between delaying or aborting requests that cannot be safely granted.
Another issue that is investigated is the effect of having conflicts between read and
write actions or write and write actions. There are methods which cannot be
classified into this timestamp vs. locking scheme (e.g., voting methods used in [36]).
There are also experimental comparative studies [10,20].


A concurrency control method, which stands out among all these algorithms is 
that employed by SDD-1 [2,3]. The reason for this is its preanalysis of a-priori 
information (i.e., the structure of the conflict graph) in order to enhance parallelism. 
An obvious question is, why should not a similar preanalysis be used to enhance the 
communication between the processes of the scheduler. 


Finally let us mention a new research direction, which developed from the 
distributed problem, but is important even for the centralized case. It is tacitly 
assumed that there is one version of each entity in the database and an update 
creates a new version making the old one obsolete. It might be possible to use older 
versions in addition to the conflict graph, in order to perform concurrency control. 
This is done by changing the semantics of “read” and "write" (e.g., Reed’s rule [27], 
before-and-after values [32]). This change in the model can have profound 
consequences, since it introduces a space-parallelism tradeoff (i.e., by using more
versions the sets of interleavings C that can be realized by schedulers can be
enriched).




2. A Model of Distributed Database Concurrency Control


This chapter contains the definition of our model for distributed database 
concurrency control. This model generalizes the centralized model, is simple and can 
be used for the analysis of all practical solutions proposed to date.


2.1 Model Definition 


A distributed database is a collection of sites. Each site has its own processor and 
data. The sites are interconnected by a network and are controlled by a distributed
database management system (DDBMS). In Fig. 2.1 we show the architecture of a 2- 
site system; horizontal arrows join modules of the same distributed process. 
Formally, a distributed database is defined as follows: 


Definition 1: A Distributed Database Design (DDD) is a quadruple <Gp, Data,
Stored-at, IC> where:

(i) Gp=(V,E) is a graph, where every node corresponds to a site and every link
to a two-way communication link between sites.

(ii) Data is a set of variables (or entities), denoted {x,y,z,...}
(i.e., physical data items).

(iii) Stored-at : Data → V is a function that determines the site, where each
physical data item is stored.

(iv) IC is a set of integrity constraints on the values of the Data. 


Note that multiple copies of the same logical data item are considered as different
physical data items stored at different sites. The fact that they are copies and must
remain identical for reasons of consistency is part of the integrity constraints, and is
not treated separately. 


The users interact with the database using transactions. In our model a transaction 
is a distributed program, not identified with a particular site. 
Definition 2: A transaction T, in a given DDD, is a directed acyclic graph (dag)
T=(N,A) such that:

(i) every node p is associated with one site of the system, site(p), and with an
entity x_p stored at that site.

(ii) all nodes associated with the same site are totally ordered in A.
A transaction system T is a set of transactions {T_i}. □


Note that it is assumed that transactions are correct programs (e.g. update all 
copies of the same logical item in order to preserve the integrity of the database). We 
denote the partial order imposed by a transaction T_i on its actions as >_{T_i}.
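
The two conditions of Definition 2, together with acyclicity, can be checked directly. The sketch below is illustrative only: the dictionary-based encoding, the function names and the sample data (in particular the entity names and most arcs, since the figures are not reproduced here) are hypothetical choices, and an arc (p, q) is read as "p precedes q".

```python
from itertools import combinations

def reachable(arcs, nodes):
    """Return {p: set of nodes reachable from p by one or more arcs}."""
    succ = {p: set() for p in nodes}
    for p, q in arcs:
        succ[p].add(q)
    reach = {}
    for p in nodes:
        stack, seen = list(succ[p]), set()
        while stack:
            q = stack.pop()
            if q not in seen:
                seen.add(q)
                stack.extend(succ[q])
        reach[p] = seen
    return reach

def is_transaction(site, entity, arcs, stored_at):
    """site, entity: dicts mapping each node to its site / entity;
    arcs: set of (p, q) pairs; stored_at: dict entity -> site."""
    nodes = list(site)
    reach = reachable(arcs, nodes)
    if any(p in reach[p] for p in nodes):                     # cycle: not a dag
        return False
    if any(stored_at[entity[p]] != site[p] for p in nodes):   # condition (i)
        return False
    return all(q in reach[p] or p in reach[q]                 # condition (ii)
               for p, q in combinations(nodes, 2) if site[p] == site[q])

# Hypothetical data: actions 1,2,3 at site 1, actions 4,5 at site 2, action 6 at site 3.
site = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3}
entity = {1: "a", 2: "b", 3: "c", 4: "d", 5: "e", 6: "f"}
stored_at = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 3}
arcs = {(1, 2), (2, 3), (4, 5), (1, 5), (3, 6)}

print(is_transaction(site, entity, arcs, stored_at))
print(is_transaction(site, entity, arcs - {(4, 5)}, stored_at))
```

Dropping the arc (4, 5) leaves the two actions at site 2 incomparable, so condition (ii) fails.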


Definition 3: The nodes of a transaction are the actions performed by the
transaction. The semantics of an action p is the indivisible execution of the following
two steps

t_p := x_p

x_p := f_p(t_p, ..., t_q, ...) where q ranges over all actions that are ancestors of p in
the transaction of p.

Here the t's are temporaries (i.e., a workspace local to the transaction) and the
x's are physical items in the database. The f_p's are uninterpreted function symbols. □


Hence the nodes of transactions stand for indivisible actions. We do not specify 
the details of the exact nature of the computation performed by each action. Instead 
we view an action p of a transaction T as an uninterpreted function symbol f_p, with
one output and |{q | q >_T p}|+1 inputs. The transactions are in fact program
schemata, where all updates are treated by the concurrency control mechanism as 
uninterpreted updates. Designing the database (i.e., deciding how many copies of 
each item there are and where they are stored) and writing correct transactions (e.g., 
which copies to update, which other integrity constraints to satisfy) are problems at a 
higher level than concurrency control, and are not treated here. 




Figure 2.1 System Architecture 


Figure 2.2 Transactions 
The particular model of actions used was chosen for its clarity. Other models,
such as those illustrated in examples 2 and 3 below could as well have been used, to 
produce results similar to those of Chapters 3 and 4. 


Example 1: Consider the transaction of Fig. 2.2(a). Actions 1,2,3 are performed 
at site 1, actions 4,5 at site 2, and 6 at site 3. The actions performed at the same site 
are totally ordered. The actions are updates as in Definition 3, so every node can be 
associated with a variable and the site this variable is stored at. This model 
generalizes the centralized model of [17]. 


Example 2: Consider the transaction of Fig. 2.2(b) with actions (1,2), (3,4), (5,6)
performed respectively at sites 1,2,3. If p is odd it is a read action with a readset of
data items stored at its site. If it is even it is a write action with a writeset instead, and
this update depends on all readsets (e.g., action 6 has writeset W6[x,y] and depends
on readsets R1[w], R3[u,v], R5[x], where w is stored at 1, u, v at 2, and x,y at 3). This
type of actions and transaction is used in SDD-1 [2,3]. 


Example 3: Consider the transaction of Fig. 2.2(c), where action j is performed
at site j (there is only one action per site). Dataset(j), of arbitrary cardinality, is 
updated based on its previous values and those of datasets of ancestor actions. This is 
a very simple model that makes the centralized version trivial (a transaction is an 
action), yet it presents interesting problems in the distributed case.


An edge in a transaction T between actions at different sites (called a cross-edge)
denotes both temporal precedence and a transfer of information (e.g., in Fig. 2.2(a)
update 5 needs data from update 1). These cross-edges correspond to user-defined
messages, which the system must service.


A history is a description of a set of transactions and the process of their
execution on the system. In a distributed system [19] it is in general impossible to tell
which one of two events occurred first (because communication is not always
instantaneous). Because of this uncertainty, we describe the execution order of the
actions by a partial order. If two events are incomparable in this partial order, either
one could have preceded the other. There are two restrictions on the partial orders.
First, what happens at every site is totally ordered; this is consistent with the
centralized problem and guarantees that the result of the execution is uniquely
determined, as in the case of individual transactions. Second, user-specified
precedences are always respected. Formally:
Definition 4: A history is a pair <T,π>, where T = {T_i, 1≤i≤m} is a transaction
system and π is a directed acyclic graph (dag) on the nodes of the transactions T_i
such that:

(i) Nodes p with the same site(p) are totally ordered.

(ii) For any transaction T_i and actions p,q ∈ T_i with p >_{T_i} q we have that p >_π q
(where >_π denotes the partial order imposed by π). □
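

As an illustration only, the two conditions of Definition 4 can be checked mechanically. The sketch below assumes a hypothetical encoding that is not part of the model above: `succ` maps each action to its direct successors in π (assumed acyclic), `site` maps each action to its site, and `trans_prec` lists pairs (p, q) meaning that p precedes q inside some transaction.

```python
from itertools import combinations

def reaches(succ, p, q):
    """True if q can be reached from p by following arcs of the dag succ."""
    stack, seen = [p], set()
    while stack:
        u = stack.pop()
        for v in succ.get(u, ()):
            if v == q:
                return True
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return False

def is_history(succ, site, trans_prec):
    """Check conditions (i) and (ii) of Definition 4 for an acyclic succ."""
    # (i) any two actions at the same site must be comparable in the order
    for a, b in combinations(list(site), 2):
        if site[a] == site[b] and not (reaches(succ, a, b) or reaches(succ, b, a)):
            return False
    # (ii) every transaction-defined precedence must be respected
    return all(reaches(succ, p, q) for p, q in trans_prec)
```

The quadratic number of reachability queries is acceptable for a sketch; a transitive closure computed once would be the natural refinement.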


Definition 5: A prefix of a history h=<T,π> is a pair <T,α>, where α is the
subgraph of π induced by a subset of its nodes such that, if action p ∈ α, all ancestors q
of p in π belong to α. □
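

A minimal sketch of the closure condition in Definition 5, assuming (hypothetically) that π is given as a dictionary `preds` from each action to its set of direct predecessors. A subset of actions induces a prefix exactly when it contains every ancestor of each of its members.

```python
def ancestors(preds, p):
    """All actions that precede p in the dag (preds: node -> direct predecessors)."""
    seen = set()
    stack = list(preds.get(p, ()))
    while stack:
        q = stack.pop()
        if q not in seen:
            seen.add(q)
            stack.extend(preds.get(q, ()))
    return seen

def is_prefix(preds, subset):
    """True if subset is closed under taking ancestors, as Definition 5 requires."""
    return all(ancestors(preds, p) <= subset for p in subset)
```

Under this encoding the empty set is always a prefix, which matches its use as the initial prefix in the later chapters.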


A history may be viewed as a special case of a parallel program schema (see Fig.
2.3). The resulting schemata and the rigorous treatment of their equivalence under
Herbrand interpretation [25] closely resemble the centralized case.


Definition 6: Two histories h_1=<T,π_1> and h_2=<T,π_2> are equivalent (h_1 ≡ h_2)
iff their schemata are strongly equivalent (that is, equivalent under the Herbrand
interpretation of the function symbols and variables). □


Let H denote the set of all histories. Recall that a partial order can be considered
as a set of total orders (those compatible with it). Let H+ denote the set of all
histories <T,π>, where π is a total order. Therefore a history represents a particular
subset of this basic set H+. The histories with only transaction-defined cross-edges
(arcs between actions at different sites) are maximal when considered as sets of total
orders. Yet histories can have other cross-edges also (e.g., arc (4,6) in Fig. 2.3), whose
presence restricts the allowable total interleavings of actions. The goal of concurrency
control is to recognize on-line large sets of correct total interleavings.
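

To make the view of a history as a set of total orders concrete, the following sketch enumerates all total orders compatible with a partial order. The encoding (`preds` maps each action to its direct predecessors) and the action names are assumptions made for the example; the enumeration is exponential in general and is meant only for very small cases.

```python
def linear_extensions(preds):
    """Yield every total order (as a tuple) compatible with the dag preds."""
    nodes = set(preds)

    def extend(done, order):
        if len(order) == len(nodes):
            yield tuple(order)
            return
        for p in sorted(nodes - done):
            if preds[p] <= done:          # all predecessors already placed
                yield from extend(done | {p}, order + [p])

    yield from extend(frozenset(), [])

# 'a' precedes 'b' at one site; 'c' runs at another site, unordered w.r.t. both.
loose = {'a': set(), 'b': {'a'}, 'c': set()}
# Adding the cross-edge b -> c leaves a single compatible interleaving.
tight = {'a': set(), 'b': {'a'}, 'c': {'b'}}
```

Here `loose` admits three interleavings and `tight` only one, which is the sense in which an extra cross-edge restricts the set of total orders a history represents.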
Since individual transactions are correct (i.e., take the database from a correct
initial to a correct final state), histories in which transactions are executed one after
the other (serial histories) are correct. Also those histories that are equivalent to
them, called serializable, are correct. We denote the set of serializable histories by SR
(SR ⊆ H).


Definition 7: A history h is serial iff

(i) The execution of actions at each site introduces a total order of transactions at
that site (i.e., there are no transactions T_i, T_j, i≠j, with actions p,q ∈ T_i, r ∈ T_j
at the same site with p preceding r and r preceding q).

(ii) If T_i precedes T_j at one site it does so at all sites where both transactions
have actions. □


Definition 8: A history is serializable iff it is equivalent to a serial history. □
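

A sketch of a test for Definition 7, under an assumed encoding: `site_orders` maps each site to the sequence of transaction identifiers of the actions executed there, in execution order. Condition (i) says that the actions of a transaction form one contiguous block at each site; condition (ii) says that no two transactions appear in opposite orders at two sites.

```python
from itertools import groupby

def is_serial(site_orders):
    """Check conditions (i) and (ii) of Definition 7 on per-site sequences."""
    before = set()  # pairs (Ti, Tj) such that Ti precedes Tj at some site
    for seq in site_orders.values():
        blocks = [t for t, _ in groupby(seq)]   # collapse runs of equal ids
        if len(blocks) != len(set(blocks)):
            return False                        # a transaction is interleaved: (i) fails
        for a in range(len(blocks)):
            for b in range(a + 1, len(blocks)):
                before.add((blocks[a], blocks[b]))
    # (ii) no pair may be ordered both ways across sites
    return all((tj, ti) not in before for (ti, tj) in before)
```

The transaction identifiers and the dictionary layout are illustrative choices, not notation from the text.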


In the next section we will show that deciding serializability is an easy task. This
task becomes NP-Complete if the model with read and write actions (instead of
updates) is used [25]. Even in that case SR has interesting efficiently recognizable
subsets (e.g., DSR [25]). What is significant is that deciding whether a history is
serializable in a centralized or distributed model are practically identical tasks. We
discuss this similarity in the next section.


As in the centralized case, synchronization is necessary only between actions of a
transaction system which operate on the same data (i.e., conflict). These conflicts are
represented by the conflict graph G(T).
Definition 9: For the transaction system T = {T_i, 1≤i≤m}, the conflict graph
G(T) is an undirected multigraph (V,E), with a partial order >_i associated to the
edges incident upon each node i, such that:

(a) V = {i | 1≤i≤m}, with node i corresponding to transaction T_i.

(b) E is a multiset of edges. E = {copies of edge ij | for every copy of ij there is a
distinct pair of actions {p,q} with p ∈ T_i, q ∈ T_j, i≠j and x_p = x_q}.

(c) For two edges incident at node i we have ij >_i ik iff the action in T_i
corresponding to ij is identical to or precedes the action in T_i corresponding to ik.


Note that an edge in E denotes a conflict between two transactions. Every edge ij
in E corresponds to a pair of actions {p,q}, which update the same variable. Based
on where this variable is stored we can partition E into as many multisets as there are
sites (e.g., “red” and “green” edges for two sites). For an example see Fig. 2.4.
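

The edge multiset of Definition 9 and its partition by site can be sketched as follows. The encoding is an assumption for illustration: each transaction is a list of (action, variable) pairs, and `site_of` gives the site at which each variable is stored. The partial orders >_i are not computed here.

```python
from itertools import combinations
from collections import Counter

def conflict_edges(transactions):
    """Return one edge (i, j, p, q) per pair of actions p in T_i, q in T_j
    (i != j) that update the same variable."""
    edges = []
    for (i, acts_i), (j, acts_j) in combinations(sorted(transactions.items()), 2):
        for p, xp in acts_i:
            for q, xq in acts_j:
                if xp == xq:
                    edges.append((i, j, p, q))
    return edges

def edges_per_site(transactions, site_of):
    """Count conflict edges by the site storing the conflicting variable."""
    var_of = {p: x for acts in transactions.values() for p, x in acts}
    return Counter(site_of[var_of[p]] for _, _, p, _ in conflict_edges(transactions))

# Hypothetical system: T1 updates x then y; T2 updates x twice.
example = {1: [('a', 'x'), ('b', 'y')], 2: [('c', 'x'), ('d', 'x')]}
```

In `example` the edge between transactions 1 and 2 appears twice, once per conflicting pair of actions, which is why E has to be a multiset.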


An ordered mixed multigraph G = (V,E,A,{>_i}) is a mixed multigraph with E a
multiset of edges, A a multiset of directed edges and a partial order >_i at each node i
of the edges incident at the node. Conflict graphs are such objects with A=∅.


Since a conflict (or an edge in G(T)) corresponds to two actions at the same site
and a history h=<T,π> has a total order of the actions at each site, we can say that a
history resolves all conflicts. That is, if edge ij corresponds to the pair of actions
{p,q}, p ∈ T_i, q ∈ T_j, i≠j, we direct ij from i to j iff p >_π q.


Definition 10: A prefix <T,α> of a history assigns a direction (ij) to an edge ij of
the conflict graph G(T) iff all histories which have <T,α> as prefix assign ij the
direction (ij). Thus a prefix <T,α> determines an assignment of directions to some
edges of the conflict graph.

Conversely, an assignment of directions to edges of the conflict graph is
realizable by a prefix if there is a prefix of a history assigning these directions and
no others. □


Thus a prefix <T,α> determines a unique ordered mixed multigraph,
which is G(T) with some of its edges directed.
Up until now the distributed problem appears to be a straightforward
generalization of the centralized case. What is considerably more complex in the
distributed case is the subject of schedulers, and their design to meet performance
specifications. For an exposition of the relatively simple theory for the centralized
case see [25].


Our schedulers will be distributed algorithms characterized by the parallelism 
they provide and by their efficiency. We will measure parallelism using sets of 
histories C, that is subsets of H. The efficiency of the schedulers will be measured by 
the worst-case number of steps they execute and the worst-case number of messages 
they use. We will be interested in the following special C’s: 


Definition 11: Consider a set of histories C ⊆ H, such that for each h ∈ C the only
cross-edges (edges between actions at different sites) are defined by the transactions.
Such a C we shall call a concurrency control principle. □


C is chosen in such a way that all h ∈ C are correct. The larger C is, the higher
the level of parallelism supported by this concurrency control principle. Examples of
concurrency control principles are serializability and serial (one-at-a-time) execution.
Obviously, the former supports more parallelism. Thus concurrency control
principles are very natural classes of histories measuring parallelism, although not all
subsets of H can be expressed as such.


A scheduler Q is a distributed algorithm. (We do not explicitly specify the 
model of computation, although we shall use a concurrent language notation as 
needed). It consists of a set of communicating sequential processes [15], one for 
each site. Its instructions may involve the following: 

1) Local computation.

2) Receiving an execution request for an action q. 

3) Granting an execution request of an action q. 

4) Sending a message to another site (i.e. send(<message>)) 

5) Receiving a message from another site 
Each history h corresponds to a set {h+} of total orders (those that do not
contradict h). Let h+ denote any total order which respects the partial order of
history h. If C is a set of histories, we let C+ = {h+ | h ∈ C}. H is the set of all
histories. An element of H+ is a string, that is, a mapping from {1,2,...,n} to N,
where N is the set of all actions and |N|=n. In fact it is a pair <T, string>, but we
omit T when it is obvious from the context. The jth symbol of h+ ∈ H+ is
denoted by h_j+.

We thus assume that there is a total order on the arriving execution requests.
This is a simplifying analytical tool (a formalism of the familiar notion of a
timestamp) and is not used by the scheduler, whose processes still perceive the
world in terms of partial orders. We therefore have a global clock, whose ticks
are the arrivals of execution requests. This sequence of execution requests is the
input of the scheduler. What is the output of a scheduler? It cannot be just a
sequence of actions, as the relative ordering of the granting of requests with
respect to their arrival is also important. The output of the scheduler is an
n-tuple of strings S = (s_1,s_2,...,s_n) ∈ (N*)^n. Here s_j denotes the sequence of granted
requests between the jth and (j+1)-st (after the jth if j=n) arrivals of requests.
N* is the set of all strings constructed from the set of actions N and includes the
empty string. The concatenation of the n strings, conc(S), should be in H+.

Thus a scheduler Q, besides being a distributed algorithm, is a
nondeterministic mapping (i.e., a set of mappings) from H+ to (N*)^n.

For each total order h+, Q will produce a stream S of granted requests; one
nondeterministic element is that of the various communication delays. A set of
communication delays is a function d, which assigns to each execution of a send
instruction by a process of Q a nonnegative real number. Not all functions are
delay functions. The delay function has to be feasible, in that an action p must
be executed before a successor q of p, in its transaction, can be requested. Note
that the zero function d=0 is always a feasible delay function. Therefore the
mapping Q_d: H+ → (N*)^n is well-defined for each feasible delay function d,
assuming that local computation proceeds at a rate far faster than the arrival of
requests and messages. □
Consider a set of histories C ⊆ H. Scheduler Q realizes C if all outputs of Q are
in C (and thus presumably correct) and, furthermore, if Q is fed with a history in C
and all delays are 0, then Q grants all requests without making them wait. It is
argued in [25] that these are traits, in the centralized case, of all schedulers that are
on-line and optimistic (two intuitive properties shared by all existing schedulers). The
same arguments are applicable to justify Definition 12, where total orders and strings
of actions are used to formalize this intuition.


Each process makes decisions about whether to grant or delay pending requests.
These decisions can only depend on the information available to each process
(i.e., T and the requests that it knows have been granted or are pending). This can be
viewed as a consequence of the power of the set of instructions used (see above).


Definition 12: We say that Q is a realization of C iff
(a) conc(Q_d(h+)) ∈ C+ for all h ∈ H and delay functions d.
(b) Q_0(h+) = (h_1+,...,h_n+) for all h ∈ C. □


We illustrate the above definition in Fig. 2.5. If h+ ∈ H+ is the input to Q there
are many possible computation paths (i.e., sequences of events in the system). This is
because of the essentially random delivery time of the messages. So every path has
associated with it the delays of messages used along this path and has output
(s_1,s_2,...,s_n) ∈ (N*)^n. The conditions are that the granted requests always form a
correct history (a) and, moreover, if requested actions form a correct history and all
delays are zero, then the requests must be granted immediately (b). These conditions
must hold for all computation paths. So there is a difference between the use of the
term nondeterminism above and that of classical complexity theory.


There also is a feedback effect from output to input (i.e., requests cannot be
made if their ancestors in transactions have not been granted). This problem, which
is due to our choice of an input-output description, could restrict the set of inputs to a
particular scheduler. Yet all prefixes of histories in C must still be inputs to all
schedulers realizing C. This is also true for all prefixes not in C that are minimal
(their prefixes are in C). These will be the only inputs of interest in Theorem 3.
Definition 13: The computational complexity of Q is the worst-case sum of the
counts of all local computations by Q over all processes of Q. The communication
complexity of Q is the worst-case count of all send instructions executed by all
processes of Q. □


Note that apart from the messages generated by the scheduler processes of the
system there is also user-defined communication, implied by transaction cross-edges
(e.g., some action at site 2 needs data from site 1). This communication is assumed
free, since it is unavoidable, and can be used to pass information between scheduler
processes at no cost.


A scheduler Q is polynomial time bounded (or computationally efficient) if its
computational complexity is bounded by a polynomial in n (i.e., n=|N|, where N is the set
of actions of T). This means that all possible computation paths have computational
complexity (number of local steps) bounded by a polynomial in n.
We may even augment the computation power of our schedulers if we allow
them, in their local computation steps, to consult an oracle [11] for a hard
computational problem (say an NP-Complete one). Many of our results will still
hold for such schedulers.


Finally, in order to characterize communication complexity we define the
following classes M_C(b):


Definition 14: For a prefix <T,α> of C and an integer b≥0 we say that
<T,α> ∈ M_C(b) if there is a realization Q of C such that the total sum of send
instructions executed at all processes of Q after <T,α> is b or less.

Let b*(T) be the least b for which <T,∅> ∈ M_C(b). A scheduler which achieves
b*(T), for every T, is called communication-optimal. □


Note that M_C(b)=∅ if b<0 and M_C(b) ⊆ M_C(b+1). This definition describes the
communication used if both processes of the scheduler are started with initial
information <T,α>.


What Definition 14 says is that a priori information about the syntax of the
transactions could be used to enhance the communication performance (worst-case
number of messages used at run time) of the concurrency control mechanism. This is
analogous to the conflict graph analysis used to improve parallelism in SDD-1 [2,3].
A communication-optimal scheduler is the limit in message performance attainable,
subject to a parallelism requirement C.


In Section 2.2 we will show that our model is a simple generalization of the
centralized case and that there exists a computationally efficient scheduler realizing
SR. In Chapter 3 we will recursively characterize the classes M_C(b) and prove that
there exists a communication-optimal scheduler realizing SR. Finally in Chapter 4 we
will examine the complexity of deciding whether a prefix is in M_SR(b) and prove
that, if NP≠PSPACE, no scheduler can realize SR and be both computationally
efficient and communication-optimal. This will be true even if we restrict our system
to two sites, and our transactions to sequences of six updates each.
2.2 Properties and Limitations of the Model


The model presented in Section 2.1 consisted of extending the definitions of
centralized concurrency control by introducing, where necessary, partial orders
instead of total orders and by partitioning the conflicts according to sites. A more
technical part was involved with defining the class of allowable distributed
schedulers. We can now state the distributed problem we will examine:


DCC: The problem of Distributed Concurrency Control is, given a set of
histories C (which we can prove correct), find a scheduler which realizes C and is
efficient (in terms of both local computation and communication).


Similarly to [25] we can prove: 


Theorem 1: C has a computationally efficient realization iff the set of prefixes of
C is in P (i.e., deterministic polynomial time).


Proof: Since we can expend an indefinite amount of communication between
the different modules of a scheduler, the problem reduces to the centralized one (one
site gathers all information and makes all decisions). Therefore the constructive proof
of [25] is applicable. For arbitrary delays this construction gives us outputs in C+;
for 0 delays Definition 12(b) is also satisfied. □


Since the analysis we will be presenting deals primarily with the assignment of
directions to edges of the conflict graph G(T) by a prefix <T,α>, we need a
characterization of realizable assignments (see Definition 10).
Lemma 1: Given a conflict graph G(T) = (V,E,∅,{>_i}), an assignment of
directions to a multiset X of its edges, producing the ordered mixed multigraph
(V,E\X,A_X,{>_i}), is realizable iff:

(a) If ij ∈ X and is directed from i to j and ik >_i ij then ik ∈ X.

(b) A_X has no directed cycles (i_1 i_2 i_3 ... i_k i_1) such that:

i_1i_2 >_{i_2} i_2i_3, i_2i_3 >_{i_3} i_3i_4, ..., i_ki_1 >_{i_1} i_1i_2.

Proof: “only if”: Given a prefix <T,α> of a history, let us first assign the direction
(ij) to any edge ij in G(T), which corresponds to a pair of conflicting actions {p,q},
under the following conditions:
(1) p ∈ T_i, q ∈ T_j
(2) p ∈ α
(3) if q ∈ α then p >_π q


Obviously all histories which have <T,α> as prefix resolve these conflicts in the
same way. Moreover, if an edge has not been given a direction then both its actions
p',q' are not in α. We can complete <T,α> with suffixes of histories that have p',q' in
both orders. This proves that the directions we have constructed are exactly those
assigned by <T,α>.


Because of causality both conditions (a) and (b) obviously hold for the directions
constructed above.


“if”: Given an assignment A_X we construct the following digraph (V_0,A_0):


V_0 (vertex set):
If (ij) ∈ A_X and ij corresponds to conflicting actions {p,q}, p ∈ T_i, then p ∈ V_0.
If p ∈ V_0, p ∈ T_i, then all ancestors of p in T_i belong to V_0.


A_0 (arc set):
If p,q belong to the same T_i and p >_{T_i} q then (pq) ∈ A_0.


If p,q correspond to an (ij) ∈ A_X then (pq) ∈ A_0.


Since (b) is true, (V_0,A_0) is acyclic, and since (a) is true, transaction precedences
are respected. Thus (V_0,A_0) has the same nodes as some prefix and respects all its
conflict resolving orderings (see the “only if” part of the proof). By topologically sorting
the nodes of (V_0,A_0) we can produce the desired prefix. □
We will now characterize the serializable histories and prove that the prefixes of
SR are polynomially recognizable (in P).


For the model of actions we are using (i.e., t_p := x_p; x_p := f_p(t_p, ..., t_q, ...)) we say
that action p reads a variable x from q in history h=<T,π>, if x_p = x_q = x and q is the
ancestor of p closest to p in π. The reads-x-from relation in our model is always a
chain of all actions p for which x_p = x. The chains for all x's give us the reads-from
relation. It is easy to see that we can represent the reads-from relation for a given
history h=<T,π> as a directed multigraph D(h), with nodes corresponding to
transactions and edges corresponding to edges of these chains (labelled by the
variable read and the action reading it). In D(h) we can omit arcs of the form (i,i)
because we can deduce these from T.


Since histories are program schemata, we have from standard schemata
equivalence theory [25]:


Proposition 1: Two histories h_1=<T,π_1> and h_2=<T,π_2> are equivalent iff
D(h_1)=D(h_2) (i.e., they have the same actions and the same reads-from relation). □


For other models of actions it is necessary to distinguish between live and dead
transactions [25]. In our model, all transactions are live. Obviously for a serial history
h_s, D(h_s) is acyclic.


The following theorem (an obvious generalization of the centralized case) is yet
another variant of a veritable “folk” theorem [3,17,25,28,40]:


Theorem 2: A history h is serializable iff it resolves conflicts without creating
directed cycles in G(T). Similarly, a prefix has a serializable completion iff the
already resolved conflicts do not create a directed cycle in G(T).


Proof: Let D(h) represent the reads-from relation for h. If h ≡ h_s for h_s serial
then D(h)=D(h_s), which is acyclic. If D(h) is acyclic we can find a total order of
transactions by topologically sorting it and then consider the serial history which
respects this total order on all processors. This serial history has the same D(h). The
only difference from the centralized case is that D(h) can be partitioned into as many
subdags as there are sites.


It is easy to see that D(h) is acyclic iff G(T) with the assigned directions is
acyclic. A scheduler, recognizing serializable interleavings and knowing of all
requests (operating in a centralized manner), would arbitrate requests on-line by
making sure that the assignment of directions to the conflict graph introduces no
directed cycles. This can be done in polynomial time. Therefore there is a
computationally efficient scheduler realizing SR. □
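

The acyclicity test behind Theorem 2 can be sketched for a complete history. The encoding is assumed for illustration: `site_orders` maps each site to the sequence of (transaction, variable) updates executed there, in order. Each conflict is directed from the transaction acting first to the one acting later, and the resulting directed graph on transactions is searched for a cycle. This is an off-line check of the criterion, not the on-line scheduler described in the proof.

```python
def is_serializable(site_orders):
    """Direct every conflict by execution order and test for a directed cycle."""
    succ = {}
    for seq in site_orders.values():
        for a, (ti, x) in enumerate(seq):
            succ.setdefault(ti, set())
            for tj, y in seq[a + 1:]:
                if tj != ti and x == y:      # same variable, different transactions
                    succ[ti].add(tj)

    WHITE, GREY, BLACK = 0, 1, 2
    color = {t: WHITE for t in succ}

    def has_cycle(t):
        # depth-first search; meeting a GREY node closes a directed cycle
        color[t] = GREY
        for u in succ[t]:
            if color[u] == GREY or (color[u] == WHITE and has_cycle(u)):
                return True
        color[t] = BLACK
        return False

    return not any(color[t] == WHITE and has_cycle(t) for t in succ)
```

Only the per-site orders are consulted, in line with the observation that cross-edges are not needed for deciding serializability. The running time is polynomial in the number of actions.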


It is easily seen from the above analysis that histories with the same total orders
on each site are equivalent, and cross-edges are not needed for deciding
serializability. These edges, between actions at different sites, can be used in relating
histories and performance of distributed schedulers.


Let us end this Chapter with a brief discussion of the properties of our model.
The advantages of this model are:


(a) generality: All models of transactions and schedulers proposed have the
properties of our model. Variations in the format of transactions (e.g., defining
separate read and write actions) do not affect the results that will be presented.


(b) mathematical simplicity: All cases are treated uniformly (e.g., copy
equivalence is just one more instance of the integrity constraints). All questions are
reduced to questions on concrete combinatorial objects (e.g., conflict graphs). There
are no hidden assumptions since the performance measures (parallelism, computation
steps, messages) and the model of distributed algorithms are well-defined.


(c) compatibility: The model is an extension of the centralized case. In Section
5.1 we will be able to express distributed locking policies in the model, just as was
done in the centralized case.


(d) correctness: Serializability is not the only notion of correctness, but it is
certainly the most generally accepted one. It is intimately related to the a priori
information about the syntax of T.
On the other hand there are some disadvantages:


(e) Restricting attention to the three measures of performance: We ignore goals
which are important for distributed systems but hard to treat mathematically (e.g.,
reliability of the update mechanism, which is usually handled by two-phase commit
protocols [14]).


(f) The assumption that all syntactic information is known at run time:
Information about transactions is not always available before the transaction is
initiated. There is a whole spectrum of possibilities, between total syntactic
information being known before run time (static case) and the completely dynamic
case, in which information is acquired for each action separately as it is presented for
execution.


(g) The measure of parallelism used (i.e., the size of the set C) is a crude
approximation of the average user delay [25].


These disadvantages are shared by most formal work on database concurrency
3. Communication-Optimal Schedulers and Games


We will now state and prove a theorem which relates the structure of histories
and their prefixes with the number of messages necessary and sufficient to achieve a
performance C.


3.1 A Recursive Characterization of Communication Complexity. 


As defined in Section 2.1 the performance measure for parallelism C is a set of
histories (i.e., C ⊆ H). In this section we require C to be a concurrency control principle
(see Definition 11). Concurrency control principles are very natural classes of
histories measuring parallelism (examples are serializability SR, and serial execution
S). Let PR(C) be the set of prefixes of histories in C. Two properties of C are used in
our recursive characterization of communication complexity. First, if C is a
concurrency control principle, then for each h ∈ C the only cross-edges (edges
between actions at different sites) are defined by the transactions. Second, we have
an efficient (polynomial time in n) test of membership of a prefix in PR(C) (for
example, if C=SR Theorem 2 provides us with such a test). If no such test is
possible, concurrency control is quite hopeless, even in the centralized case [25].


Let us briefly review the notation used. A prefix is denoted as a pair <T,α>,
where T represents the transactions (a priori syntactic information) and α the order
in which some actions were executed. We use α for <T,α> when there is no
ambiguity about T. Also (β/α)_i denotes the prefix of β that contains α and all actions
of β at site i (the projection of β at site i given α). So α is a prefix of β and of (β/α)_i.
Finally we use M_C(b), where M_C(b) ⊆ PR(C), for the set of all prefixes <T,α> of C
such that there is a realization of C which, when started with <T,α>, sends b or fewer
messages.
Theorem 3: Let C be a concurrency control principle, <T,α> a prefix in PR(C),
and b a nonnegative integer. Let i denote an index ranging over the site numbers,
i ∈ {1,2}. Then the following are equivalent:


(I) <T,α> ∈ M_C(b)


(II) For all <T,β>:
     if   (1) <T,β> ∉ PR(C) and
          (2) ∀i <T,(β/α)_i> ∈ PR(C)
     then (3) ∀i <T,(β/α)_i> ∈ M_C(b) and
          (4) ∃i <T,(β/α)_i> ∈ M_C(b-2). □

Less formally (II) reads as follows:
"For all continuations α_1, α_2 of α such that α_1 is α with some actions at site 1 added,
and α_2 is α with some actions at site 2 added, and such that their least common
continuation β is not a prefix of C (while α_1, α_2 are), the following holds:
<T,α_1>, <T,α_2> ∈ M_C(b) and one of them is in M_C(b-2)."


We will first give an intuitive interpretation of Theorem 3 (which is illustrated in
Fig. 3.1). Consider a scheduler which realizes C, starts from <T,α> and receives input
requests <T,β>. Each one of the scheduler processes i, i ∈ {1,2}, can see (β/α)_i without
sending any messages. This is because process i (e.g., process 1 in Fig. 3.1) knows α
(e.g., ∅ in Fig. 3.1), receives the actions of β to be executed at site i (e.g., actions 4 and
5 in Fig. 3.1) and, using the transaction-defined messages (e.g., action 5 needs data
from action 6 in Fig. 3.1), can learn about some actions at the other site (e.g., actions 6
and 8 in Fig. 3.1).


A situation that forces communication is one where the projections of the input
that each process sees directly seem correct (i.e., <T,(β/α)_i> ∈ PR(C)) and therefore
must be executed on-line to achieve the goal C, yet the real input could be incorrect
(i.e., <T,β> ∉ PR(C)). For the example in Fig. 3.1, α=∅ and there is a unique
minimal "bad" continuation β. We use α_i as a shorthand for (β/α)_i when there is no
ambiguity.


Theorem 3 tells us that these are the only cases for which we need
communication between scheduler processes; furthermore, to guard against such
“bad” β's only one (β/α)_i (say (β/α)_{i*}, or α_{i*} for short) has to be in M_C(b-2). The
communication protocol is built in such a way that the corresponding site i* will ask
for the approval of the other site in order to execute α_{i*}. There is therefore a
balancing of the send instructions among the two processes of the scheduler, with
each send instruction guarding against a "bad" β.


The rigorous proof of Theorem 3 is given below. In one direction it entails an
adversary argument and case analysis. For the other direction we give an explicit
recursive construction of scheduler processes that realize C within the prescribed
number of messages. The basic idea of this construction is the following: Let α_1, α_2 be
correct continuations of α and projections of an incorrect β. Let Q_i (i=1,2) be a
message-optimal protocol, given that α_i has been executed. Then the Q_i's can be
combined to produce a Q that is message-optimal, given that α has been executed. If
Q_i uses more messages than Q_j then the process of Q at site j will have the send
instruction guarding against β.




Figure 3.1 
(a) Transaction system (u,v,w at site 1, x,y,z at site 2) (e.g., action 1 updates x)
(b) Conflict graph (---- = conflicts at site 1, —— = conflicts at site 2)
(c) Illustrating Theorem 3. Above: prefixes. Below: assignments of directions.
Proof of Theorem 3: Let α_i denote (β/α)_i. Theorem 3 recursively characterizes
<T,α> ∈ M_C(b), based on prefixes <T,α_1>, <T,α_2>, which properly contain <T,α>. The
containment is proper because of conditions (1),(2) of the Theorem. The last actions
of <T,β> at sites 1 and 2 (p_1 and p_2 respectively) are concurrent and not contained in
α. Consequently α_1 containing p_1 and α_2 containing p_2 are not prefixes of each other.
Note that in order to terminate the recursion we use the following facts: if b<0 then
M_C(b)=∅ and if h is a history in C then h ∈ M_C(0). For b=0 the statement of the
theorem becomes: "<T,α> ∈ M_C(0) iff no prefixes <T,β> exist satisfying conditions
(1),(2) and (3)".


"I=>II": We will now prove that if β exists with properties (1),(2) and (3) and
{<T,α_1> ∉ M_C(b)} ∨ {<T,α_2> ∉ M_C(b)} ∨ {<T,α_1> ∉ M_C(b-2) ∧ <T,α_2> ∉ M_C(b-2)}
then <T,α> ∉ M_C(b). This is obvious if one of the first two members of the above or
clause is true. If both are false but the third member is true we will prove that
communication involving two messages is forced, between the execution of <T,α>
and that of <T,α_1> or <T,α_2>, for all schedulers realizing C. For this we will use the
general specifications for a programming system as outlined in Section 2.1.


Consider the following situation that the process of the scheduler at site 1 (site 2)
can face. It receives request p_1 (p_2), while knowing that certain requests <T,γ> have
been granted with γ = α_1 - {p_1} (γ = α_2 - {p_2}). It has to decide whether to grant or
delay p_1 (p_2). If it grants the request, then according to its local view of the input the
result would be correct. Its local view of the input can be the actual input; that is, it
could be the case that the input history is in C, it has <T,α_1> (<T,α_2>) as a prefix, and
no other requests have been submitted at the other site yet. Therefore the scheduler
cannot delay p_1 (p_2) for the purpose of waiting for some future request submitted at
site 1 (site 2). It has the following two options. First, the process of the scheduler at
site 1 (site 2) can either grant p_1 (p_2) directly or after receiving a message from the
process at the other site. Second, it can inform the other site of p_1 (p_2) or it can
withhold that information. These two options, expressed as sets of instructions in our
programming system, give rise to the only four possible cases for site 1 (site 2) to
handle p_1 (p_2). These are cases A1-A4 (cases B1-B4 are symmetric).

Case A1: if (input as seen at site 1 is in PR(C)) then grant p_1

    In this case the process at site 1 does not wait or inform site 2 of its
    decision.

Case A2: if (input as seen at site 1 is in PR(C)) then grant p_1
         send (message to site 2)

    In this case the process at site 1 does not wait but informs site 2 of its
    decision. The message can potentially contain all available information at site
    1. The order of these instructions can be interchanged.

Case A3: wait (for message from site 2)
         if (input as seen at site 1 is in PR(C)) then grant p_1

    In this case the process at site 1 waits for information from site 2, but
    does not send any information. Interchanging the order of these steps will be
    treated similarly with case A1.

Case A4: send (message to site 2)
         wait (for message from site 2)
         if (input as seen at site 1 is in PR(C)) then grant p_1

    In this case the process at site 1 informs site 2 of its problem, and waits
    for an answer before proceeding. Any permutation of the steps also uses two
    messages.
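The four cases arise from two independent binary choices: whether site 1 waits for a message, and whether it informs site 2. A minimal Python sketch of that tabulation follows; the names `messages_used`, `CASES`, and `case_table` are illustrative and do not appear in the text, and the count is simply the number of messages site 1 sends or waits for in each case.

```python
from itertools import product

def messages_used(waits: bool, informs: bool) -> int:
    """Messages involved at site 1 for one case: one received if it waits,
    one sent if it informs site 2."""
    return int(waits) + int(informs)

# (waits, informs) -> case label, in the order A1..A4 used in the text.
CASES = {
    (False, False): "A1",  # grant directly, say nothing
    (False, True): "A2",   # grant directly, inform site 2
    (True, False): "A3",   # wait for site 2, say nothing
    (True, True): "A4",    # inform site 2 and wait for its answer
}

def case_table() -> dict:
    """Map each case label to the number of messages it involves."""
    return {
        CASES[(waits, informs)]: messages_used(waits, informs)
        for waits, informs in product([False, True], repeat=2)
    }
```

Only case A4 (and, symmetrically, B4) involves two messages, which is the situation treated in possibility (i) of the argument.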
We will now reach a contradiction by examining two possibilities.

(i) If either the process at site 1 uses the instructions of case A4 or the process at
site 2 uses the instructions of case B4 then two messages are consumed in executing
either <T,α_1> or <T,α_2>. Since we assume these prefixes belong to M_C(b) and not to
M_C(b-2) and they are prefixes of <T,α>, we will have to use at least (b-1)+2 = b+1
messages to achieve our performance goals starting from <T,α>.


(ii) For all other combinations of cases of instructions we will also find
contradictions.

Using case Ai instructions for site 1 and case Bj instructions for site 2, for
i,j ∈ {1,2}, we obviously have situations where the input prefix is <T,β> ∉ PR(C) and
is (incorrectly) executed without rearranging requests.

In any one of the remaining combinations either site 1 uses instructions of case
A3 or site 2 uses instructions of case B3. We will reach a contradiction using A3
instructions (B3 is symmetric). Let the input history h* be in C and have <T,α_1> as
prefix. When the request for p_1 is submitted to the process of the scheduler at
site 1, the process will wait for a message from the other site, which will determine its
decision (granting p_1 or making it wait for future requests from other transactions).
But when actions of <T,α_1> are being executed at site 2 no such message can be sent.
This is because according to site 2 both <T,α_1> and <T,α_2> are possible (proper)
continuations and decisions cannot be made excluding one or the other. So the
message site 1 is waiting for will be sent when descendants of <T,α_1> arrive at site 2.
Thus we force action p_1 to wait for some action which is not its ancestor in h*, and
therefore h*, although in C, is not executed on-line as required by Definition 12 of
Section 2.1.

"II=>I": Under the conditions of the theorem we will construct a realization of
C achieving the desired performance. That is, we will present a scheduler, which will
consist of two processes (i.e. LOCALSCHED_i(<T,α>,b), i=1,2) and recognize on-line
all histories in C with <T,α> as prefix, without executing more than b send
instructions in the worst case. The algorithm is written in a programming system with
the capabilities outlined in Section 2.1.


The LOCALSCHED processes (see Fig. 3.2 for i=1) communicate with
transactions and with each other using messages. The messages received by a process
are buffered in a FIFO queue. The variables that the scheduler processes use for
recording the state of the system are the state variables s_i, r_i, t_i, p_i, and b. The
variables m_i (modes) are used to synchronize the two processes, so that when one
process asks the other a question it expects an answer before examining other
requests. The execution of send instructions is controlled by the conditions of
Theorem 3. The procedures Grant_i grant requests. Finally, the procedures Delay_i,
Delay*_i handle the cases where the input is discovered to be incorrect. Let us explain
the above features in some detail.

LOCALSCHED_1(<T,α>,b)
1. s_1:=<T,α>; r_1:=<T,α>; t_1:=∅; p_1:=∅; m_1:=normal;
2. when queue nonempty do
3.    if m_1=normal
         then M:=first message of queue (delete it from queue);
         else wait (for message of type Q or A); M:=first such message;
4.    (Based on M assign)
         s_1:=(state of 1);
         r_1:=(state of 1 that is also known by 2);
         p_1:=(set of pending requests, at most one per site);
         t_1:=(state of 1 resulting if pending requests were granted);
5.    (Respond to message M) do one of three cases (R,A,Q);
6. od end

case R:
   if t_1 ∉ PR(C) then Delay_1(p_1) else
   if ∃β s.t. {t_1=(β/r_1)_1} ∧ {β ∉ PR(C)} ∧ {(β/r_1)_2 ∈ PR(C)} ∧ {t_1 ∈ M_C(b-2)}
      then m_1:=wait; send <2,Q,p_1,s_1>;
      else s_1:=t_1; Grant_1(p_1,s_1);

case A:
   if p_1 is in s_1 then Grant_1(p_1,s_1); LOCALSCHED_1(s_1,b-2); else Delay*_1(p_1,s_1);

case Q:
   if t_1 ∈ PR(C) then s_1:=t_1;
   if m_1=normal then send <2,A,∅,s_1>;
   if t_1 ∈ PR(C) then Grant_1(p_1,s_1); LOCALSCHED_1(s_1,b-2); else Delay*_1(p_1,s_1);

Figure 3.2 LOCALSCHED at 1

(a) Messages: The messages received by the scheduler process at site 1 (for those
received at site 2 interchange 1 and 2) have the following format (i.e. there are three
types of messages): <1, type, requested action, state at site 2>.

    R (for type=request). This is a message from a transaction to the
    scheduler process at site 1. It contains a request for an action p at site 1. State
    information about site 2 is included (else it is ∅), when data from site 2 is
    necessary to compute p. This happens when an ancestor of p, in the
    transaction of p, has been executed at site 2. Then the transaction defined
    message can be used to transmit information about the state at 2. Examples of
    such messages are <1,R,p,s_2> or <1,R,p,∅>.

    Q (for type=question). This is a message from the scheduler process at
    site 2. This process needs site 1's approval in order to decide whether to grant
    some request p, when it is at state s_2. An example for such a message is
    <1,Q,p,s_2>.

    A (for type=answer). This is a message from the scheduler process at site
    2 answering a type Q message of the process at site 1. Site 2, having full
    knowledge of the system, determined whether the pending request at site 1
    should be granted. All necessary information has been incorporated in the
    state at 2. An example for such a message is <1,A,∅,s_2>.


(b) State: The state of each LOCALSCHED_i (s_i) is the prefix in PR(C) that the
process at site i knows has been executed. For example with C=SR the state is G(T)
(see Definition 9, Section 2.1), with a partial assignment of directions that can be
realized by a prefix. For this case correctness is guaranteed if acyclicity is maintained
in the directed part of the conflict graph. In addition to s_i, LOCALSCHED_i keeps an
estimate of the state of the other scheduler process, r_i. With this estimate it keeps
track of the part of s_i that the other site might not have heard of. Every time a
message is received or a request is granted s_i and r_i are consistently updated. Finally
p_i is used to store pending requests and t_i the state that would result if these requests
were granted. The variable b keeps count of site i's estimate of the number of send
instructions executed or the number of messages of types Q and A.


(c) Synchronization of the scheduler processes: The modes (m_i) are binary
variables used by the scheduler processes to guarantee that every question is
answered. A mode is either normal, indicating that new requests are processed, or
wait, indicating that the process at i needs an answer in order to decide on pending
requests and handles no requests until it receives one. As can be seen from Fig. 3.3 a
type A message is never received when the mode is normal. The two sites never
deadlock (wait for each other indefinitely), because of the effect of A and Q type
messages on the mode.


(d) Communication Protocol: Every incoming request is examined (if the mode
is normal) and if it renders the local state incorrect it is delayed. If its execution leads
to a correct local state (t_i) we determine whether send instructions should be
executed. We first examine whether it is possible for a malicious adversary to give as
input to the other site requests, also leading to a correct local state for the other site,
but such that the total input is incorrect. If this is not possible the request is granted.
If, on the other hand, this is possible some strategy has to be worked out for
communication. In that case we also test whether t_i ∈ M_C(b-2). If this is not so the
request is granted without informing the other site. If this is so, site i sends a Q
message in order to ask for the other site's permission to proceed. If it receives a
go-ahead then we notice that, after sending two messages, both local processes are in
fact LOCALSCHED_i(s_new,b_new) with common new state and new message
parameters. This makes it possible to give an inductive proof of correctness.


Three decision questions are actually answered:
    {t_i ∈ PR(C)}?
    {does a "bad" β exist with t_i=(projection of β at i given r_i)}?
    {t_i ∈ M_C(b-2)}?
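The way the three questions combine into one action (compare case R of Fig. 3.2) can be sketched as follows. This is only an illustration: the function `decide` and its three boolean parameters are hypothetical names, and the sketch takes the answers to the three questions as given rather than computing them.

```python
def decide(t_in_prc: bool, bad_beta_exists: bool, t_in_m_b_minus_2: bool) -> str:
    """Return the action for a pending request, given the answers to the
    three decision questions about the tentative state t_i."""
    if not t_in_prc:
        # Granting would make the local state incorrect.
        return "delay"
    if bad_beta_exists and t_in_m_b_minus_2:
        # A bad continuation is possible and two messages can be afforded:
        # switch to wait mode and send a Q message to the other site.
        return "ask"
    # Either no bad continuation exists, or the other site must guard.
    return "grant"
```

For example, `decide(True, True, False)` yields `"grant"`: the request is granted without informing the other site, as described in (d).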


(e) Granting requests: When LOCALSCHED_i decides to grant a request it
allows the transaction to update the variable of the requested action. Also if this
transaction will send a message to some other site it will incorporate in that message
the local state s_i. All this is achieved using Grant_i(p_i,s_i) (i.e., if p_i contains a request
for an action at site i, then let the transaction of this action perform its update and
use s_i in any messages it sends to the other site, else no operation).


(f) Delaying requests: If a request is received when the mode is wait the request
remains in the queue and will eventually be processed in its order of arrival. It is
delayed at most by the communication delay of a Q and an A message. If on the
other hand the scheduler discovers that the pending requests (at most one at each
site) would lead to an incorrect execution then it delays one pending request. There
are two cases:

    For only one site t_i ∉ PR(C). Then the process at i delays the pending
    request at i by putting it at the end of its queue (busy waiting). The scheduler
    continues functioning as if the input were correct. This is accomplished using
    Delay_i(p_i) (i.e., if p_i contains a pending request at i, then put it at end of i's
    queue, else no operation).

    Both sites discover that t_i ∉ PR(C). This happens through an exchange of
    a Q and an A message (one pending request at the site that sent the Q
    message) or of two Q messages (one pending request at each site). In this case
    Delay*_i(p_i,s_i) is used. One pending request is delayed. If there are two
    pending requests the younger one is delayed and the older one granted. Since
    consistent timestamps [19] can always be assigned to events in a distributed
    system, there is no problem in determining the younger of the two pending
    requests. Both processes of the scheduler know that the input is incorrect and
    that a common prefix s* has been executed. In this case no more send
    instructions have to be executed to realize C (see Def. 12, Section 2.1), because
    a predetermined correct completion of s* can be executed.
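The choice of which pending request Delay* delays can be sketched as below. The sketch assumes timestamps are (counter, site) pairs compared lexicographically, which is one common way to obtain a consistent total order; the text only states that consistent timestamps [19] exist, so this representation and the names `Pending` and `split_by_age` are assumptions made for illustration.

```python
from typing import NamedTuple, Tuple

class Pending(NamedTuple):
    counter: int   # logical clock value when the request was issued (assumed)
    site: int      # site number, used only to break ties
    action: str    # label of the requested action

def split_by_age(a: Pending, b: Pending) -> Tuple[Pending, Pending]:
    """Return (older, younger): the older request is granted,
    the younger one is delayed."""
    older, younger = sorted([a, b], key=lambda r: (r.counter, r.site))
    return older, younger
```

Because both sites apply the same deterministic rule to the same pair of timestamps, they reach the same decision without exchanging further messages.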


[Figure 3.3: state diagram of the mode m_1 (normal, wait); transition labels not recoverable]

Figure 3.3 The mode at 1

Let invlen(<T,α>)={number of actions in T}-{number of actions in α}

For the conditions (I) and (II) of Theorem 3 we have proven that (I) implies (II).
We will prove that (II) implies (I) by induction on invlen.


Induction hypothesis: For invlen(<T,α>)=j we have that, if (II) is true then (I) is
true and moreover after <T,α> has been executed LOCALSCHED_i(<T,α>,b), i=1,2
realizes C and sends at most b messages.

For j=0 this is trivially true since <T,α> is a history in C, there are no more
requests left and <T,α> ∈ M_C(0) ⊆ M_C(b). So we assume the hypothesis is true for j<j*
and (II) is true for <T,α> with invlen j* and some b (that depends on <T,α>). Since
we have to prove (I), we have to exhibit a realization of C that after <T,α> sends b or
fewer messages. We consider the scheduler Q that realizes C by submitting all
requests to one site, except when the input prefix <T,α> has been executed. From
that moment on Q uses LOCALSCHED_i(<T,α>,b), i=1,2. We need only consider the
operation of the scheduler after <T,α>. There are two cases:


Case A: h ∈ C. First we will examine the case where no send instructions are
executed and then the case where some are executed.

    A.1: No send instructions are executed. Then the output has to be h and
    no request p waits for the execution of a request which is not an ancestor of p
    in h (Def. 12 is satisfied). This is because on every request p the test (Is new
    state in PR(C)?) is always true and involves only local computation. The
    reason for this is that by definition of C, as a concurrency control principle, h
    has no crossedges that are not forced by the transactions. Thus the part of the
    input each scheduler sees is automatically a prefix of h. Therefore it is
    unnecessary to wait for a message from the other site to verify that what the
    local scheduler sees is indeed a prefix in PR(C). Finally note that b≥0.


    A.2: Two or more send instructions are executed (the first two resulting in
    an exchange of a Q and an A message or two Q messages). Up to the first
    exchange the previous arguments, of A.1, hold. In order to execute send
    instructions a prefix β must exist that satisfies the conditions (1),(2) of
    Theorem 3 and has the new state t_i of a scheduler process as a projection.
    Also t_i must be in M_C(b-2), which can be decided since invlen(t_i)<j* (by the
    induction hypothesis and the "only if" part of the proof). Finally since
    M_C(b-2) is not empty, b≥2. After the exchange LOCALSCHED_i(s_new,b-2),
    i=1,2 is used and we can invoke the induction hypothesis since
    invlen(s_new)<j*. So h is output on-line with at most 2+(b-2) send
    instructions after <T,α>.


Case B: h ∉ C. First we will prove that the output of the scheduler Q is a history
in C (B.1), and then that no more than b send instructions are executed (B.2).

    B.1: Let the output (the granted requests) be a history h* not in C. Then
    it has (perhaps more than one) prefixes, called γ, such that γ ∉ PR(C), γ has
    <T,α> as prefix and γ is minimal (all its prefixes are in PR(C)). Let us call q_1
    and q_2 the final actions of γ, not in <T,α>, which are at sites 1 and 2
    respectively. At least one of them must exist. Without loss of generality let site
    1 grant q_1 before site 2 grants q_2 (if γ has a q_2). Since γ is minimal we have
    that either q_2 does not exist, or q_1 is an ancestor of q_2 in h*, or q_1 and q_2 are
    concurrent in h* and then γ is an example of a β prefix of Theorem 3. If q_2
    does not exist then, when the process at site 1 receives q_1 it cannot grant it,
    because it sees from the information available to it that the result would be
    incorrect. If q_1 is an ancestor of q_2 in h* (that is, there is a transaction
    crossedge making q_1 an ancestor of q_2 in h*) then site 2 knows q_1 has been
    executed (through a transaction defined message) and delays q_2. Finally if γ
    is an example of a β prefix of Theorem 3, then some exchange of two
    messages has to take place before q_1 and q_2 are granted. By (II) one of the
    projections of γ is in M_C(b-2), b≥2, and thus, before both requests are granted,
    one of the processes sends a Q message. If this exchange results in
    LOCALSCHED_i(s_new,b-2), i=1,2 being initiated we can use induction to
    argue that γ cannot have been executed. If the exchange results in Delay*_i,
    i=1,2 being called, both processes output a correct predetermined completion
    of a common state s*. Thus we conclude that γ cannot have been executed
    and the output of the scheduler is always in C.


    B.2: Since b≥0, if no send instructions are executed we have no problem.
    If send instructions are executed, let us look at the first round of
    communication (two Q messages or one Q and one A message). If as a result
    of this exchange LOCALSCHED_i(s_new,b-2), i=1,2 is initiated with s_new ∈
    M_C(b-2), we know that invlen(s_new)<j* and b≥2 (see A.2). By induction no
    more than b-2 send instructions are used after this and again our goals are
    met. If as a result of this exchange Delay*_i, i=1,2 is initiated at both sites
    (which is possible since the input h ∉ C), then we know that b≥2. This is
    because (II) holds and a "bad" β exists. After both sites call Delay*_i they have
    a common state s* and use no more send instructions, because the completion
    of s* is predetermined and can be recognized locally. Thus no more than b
    send instructions are ever executed.


This completes the proof of Theorem 3. □


Corollary 3.1: If C, a concurrency control principle, has a computationally
efficient realization, then it has a communication-optimal realization, which can be
implemented in space polynomial in n (n=number of actions of T).


Proof: It follows from Theorem 1 that, since C has a computationally efficient
realization, recognizing if a prefix is in PR(C) can be done in polynomial time in n.
Consider the following realization Q:

    Q: (1) Each site computes b* from T, where b* = b*(T) = min{b | <T,∅> ∈ M_C(b)}
       (2) Site i uses LOCALSCHED_i(<T,∅>,b*) (i=1,2)

By the constructive proof of Theorem 3, Q is a realization of C using the
minimum (b*) number of messages. From this proof we have that four
computational tasks are performed by LOCALSCHED. These are:


(a) Given t, does t ∈ PR(C)?
This can be performed in polynomial time (and therefore space).

(b) Given t ∈ PR(C), i≠j, and r ⊆ t, is there a β such that:
{t=(β/r)_i} ∧ {β ∉ PR(C)} ∧ {(β/r)_j ∈ PR(C)}?
This can be performed in nondeterministic polynomial time (and therefore space).

(c) Given t ∈ PR(C), b>0, does t ∈ M_C(b-2)?
Using Theorem 3, the polynomial characterization of PR(C) and the theory of
alternation [5], we have that both this task and step (1) of Q can be implemented in
polynomial space.

(d) Finally if Q discovers that the input is incorrect and Delay*_i is called at both
sites then a correct completion of s* can be efficiently computed. This can be done
based on a predetermined ordering of the actions and the test of membership in
PR(C).


This completes the proof of the existence of a communication-optimal scheduler
realizing C in polynomial space and exponential time. □


We will end this section with some comments on Theorem 3. 


(1) Message lengths: Let us examine the length in bits of the messages sent. If
|T|=n there are at most n! states and in order to uniquely code a state we need
O(n log n) bits. Also we never send more than 2n messages. In the proof of Theorem 3
we have used an inefficient format for messages <...,s_i>. Although for clarity of
presentation we used s_i (O(n log n) bits) in our messages, we could have as well used
s_i\r_i (i.e., each site will hear of every action at most once). Thus in total O(n log n) bits
will be used in the worst case.
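The O(n log n) figure for coding one of at most n! states can be checked numerically. The following Python sketch is only an illustration; the function names are hypothetical, and the bound uses the elementary fact that n! ≤ n^n.

```python
import math

def bits_to_code_state(n: int) -> int:
    """Bits needed to give each of n! possible states a distinct code."""
    return math.ceil(math.log2(math.factorial(n)))

def n_log_n_bound(n: int) -> int:
    """A simple upper bound of the form n * ceil(log2 n), valid for n >= 2."""
    return n * math.ceil(math.log2(n))
```

For instance, with n=8 there are 40320 states, so `bits_to_code_state(8)` is 16, while `n_log_n_bound(8)` is 24.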


(2) More than two sites: The two site case, while being the simplest distributed 
configuration is sufficient for the results of Chapter 4. If more sites are used and the 
mode of communication is a broadcast mode, Theorem 3 can be easily generalized. 
On the other hand a network of sites makes optimal communication a more difficult 
problem, since it implicitly adds the problem of appropriately routing the messages. 


(3) Persistency: We have examined schedulers that realize C and consist of two
processes, one at each site. Each of these processes knows of some pending requests
and a prefix of a history in C that has been executed (its state).

We call such realizations of C persistent if whenever a process i discovers that
the execution of a pending request p_i would make its state s_i incorrect, it delays p_i
indefinitely and proceeds as if only the requests in s_i had been submitted.

If PR(C) ∈ P there are persistent polynomial time schedulers realizing C, as is
obvious from the proof of Theorem 1 and [25]. On the other hand the scheduler of
Corollary 3.1 is not persistent. For some incorrect inputs Delay* is used. This is
because persistency requires that messages are sent even after the input is discovered
to be incorrect. To illustrate this suppose our scheduler starts with <T,α> ∈ M_C(b) and
receives a "bad" input <T,β> with projections <T,α_1> ∈ M_C(b-2) and <T,α_2> ∉ M_C(b-2).
It is possible for <T,α_2> to have been executed when the scheduler, at the expense of
two messages, discovers the input to be incorrect. If we want our scheduler to be
persistent, starting from <T,α_2> it has to use more than b-2 send instructions.
This difference between on-line computationally efficient and on-line 
communication efficient algorithms, which accept the same strings, arises because of 
the nature of resources we are trying to optimize. In one case we wish to achieve 
performance C at asymptotic computation cost O(n*), in the other at fixed (say n/15 
or 200) communication cost. 
From the proof of Theorem 3 it is easy to see that:
"<T,α> ∈ M_C(b) iff there is a persistent realization of C, which if the input is in C
sends at most b messages after <T,α>".

We have related communication complexity of schedulers achieving parallelism
C, with the computational problems {<T,α> ∈ M_C(b)?} (which are in PSPACE).

If the input history is in C and <T,α> ∈ M_C(b) a user's delay D is bounded by:
b·(communication delay/message) ≥ D ≥ 0.
If the input history is not in C there is a user who has to wait for other users.


The approach of Theorem 3 and the formulation of the scheduling problem are
pretty much independent of concurrency control and serializability. The application
to databases provides practical motivation and analytical tools (i.e., mixed ordered
multigraphs). In fact the entire methodology can be extended to distributed on-line
computation of combinatorial functions of two integers, which in a distributed
environment are stored at two different sites [38].

3.2 Games related to Distributed On-line Computation


In this section we will define decision problems for the sets of prefixes, which
were recursively characterized in the previous section.

Distributed scheduling is related, using M_C(b), to a game on prefixes, PREFIX,
whose rules are displayed in Fig. 3.4. In this game Player I corresponds to a
malicious adversary who wishes to force communication. His move is a "bad"
continuation β of the current position α. Player II corresponds to the two
cooperating scheduler processes. Each one of his choices i* indicates which of the
two processes has the responsibility of guarding against the "bad" continuation β (by
questioning the other process before proceeding). Player I wants to prolong the game
as much as possible, whereas Player II tries to bring it to an end as soon as possible
(other than that there is no winner or loser).


From Theorem 3 we can deduce the following property of communication-optimal
realizations of C:

Corollary 3.2: The minimum number of messages sent by a communication-optimal
realization of C equals the length of PREFIX(<T,∅>) if both players play
optimally (we call this the minimax length of PREFIX).

Proof: Follows from Theorem 3 and the theory of alternation [5]. Note that
although in general we define PREFIX from an arbitrary initial position <T,α_0>, we
are in fact interested in α_0=∅. T represents the static (a-priori) information on
transaction schemata, that is used to optimize communication. Thus {<T,∅> ∈ M_C(b)?}
is equivalent to {Is the minimax length of PREFIX(<T,∅>) greater than b?}. □
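The minimax length of a game with the turn structure of PREFIX (Fig. 3.4) can be computed by a direct recursion: Player I maximizes over his legal moves, each of which offers Player II two successor positions, and Player II picks the one that minimizes the remaining length. The sketch below works on an abstract game; the callback `moves_of`, which would have to enumerate the "bad" continuations satisfying conditions (1)-(3), is hypothetical and is not constructed here. Lengths are counted in moves, so each round contributes two.

```python
from typing import Callable, Hashable, Iterable, Tuple

Position = Hashable
Move = Tuple[Position, Position]  # the two projections offered to Player II

def minimax_length(position: Position,
                   moves_of: Callable[[Position], Iterable[Move]]) -> int:
    """Number of moves played from `position` when Player I prolongs the
    game and Player II shortens it; 0 if Player I has no legal move."""
    best = 0
    for first, second in moves_of(position):
        # Player I's move plus Player II's reply count as two moves.
        remaining = min(minimax_length(first, moves_of),
                        minimax_length(second, moves_of))
        best = max(best, 2 + remaining)
    return best
```

This plain recursion may take exponential time on large games; it is meant only to make the definition of the minimax length concrete.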


In the following section we will analyze the game PREFIX for C=SR. If we
choose serializability (SR) as our concurrency control principle the board position
becomes the conflict graph G(T) with some of its edges directed. The moves of
Player I become choices of directions to undirected edges of G(T). Much insight into
PREFIX in this case is gained by studying a game played on mixed graphs called
CONFLICT and displayed in Fig. 3.5. This game is our departure point in the
PSPACE-Completeness proof, given in the next section.


PREFIX(<T,α_0>)

Initial position: For fixed C, a prefix <T,α_0>
Position before player I's move: A prefix <T,α>
Player I's move: Select a prefix <T,β> such that
    (1) β is a continuation of α, with projections α_i=(β/α)_i, i=1,2
    (2) α_1, α_2 are prefixes of C
    (3) β is not a prefix of C

Player II's move: Select i* ∈ {1,2} and set α:=α_i*

Players I and II take turns moving. Player II always moves when I does.
Player I's goal is to prolong the game as much as possible.
Player II's goal is to end the game as soon as possible.

Figure 3.4
The game PREFIX

CONFLICT(G_0)
Initial position: A mixed graph G_0=(V_0,E_0,A_0)
    (E_0 partitioned into "red" and "green")
Position before player I's move: A mixed graph G=(V,E,A)
Player I's move: Select an assignment of directions (A_X) to an X⊆E such that
    (1) R (H) is the "red" ("green") subset of X
    (2) A_R∪A, A_H∪A have no directed cycles
    (3) A_X∪A has a directed cycle

Player II's move: Select Y ∈ {R,H} and set E:=E\Y and A:=A∪A_Y

Players I and II take turns moving. Player II always moves when I does.
Player I's goal is to prolong the game as much as possible.
Player II's goal is to end the game as soon as possible.

Figure 3.5
The game CONFLICT
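Conditions (2) and (3) on Player I's move in CONFLICT amount to three cycle tests on directed graphs. The following Python sketch checks them for a proposed assignment; arcs are given as (tail, head) pairs, and the function names and the list-based representation are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple

Arc = Tuple[str, str]  # (tail, head)

def has_directed_cycle(arcs: Iterable[Arc]) -> bool:
    """Depth-first search for a directed cycle in the graph given by `arcs`."""
    succ: Dict[str, List[str]] = {}
    for tail, head in arcs:
        succ.setdefault(tail, []).append(head)
        succ.setdefault(head, [])
    WHITE, GREY, BLACK = 0, 1, 2
    color = {v: WHITE for v in succ}

    def visit(u: str) -> bool:
        color[u] = GREY
        for w in succ[u]:
            if color[w] == GREY:          # back arc: a cycle is closed
                return True
            if color[w] == WHITE and visit(w):
                return True
        color[u] = BLACK
        return False

    return any(color[v] == WHITE and visit(v) for v in succ)

def is_legal_move(a: Iterable[Arc], a_red: Iterable[Arc],
                  a_green: Iterable[Arc]) -> bool:
    """Check conditions (2) and (3): each color alone stays acyclic with A,
    but the two colors together with A close a directed cycle."""
    a, a_red, a_green = list(a), list(a_red), list(a_green)
    return (not has_directed_cycle(a + a_red)
            and not has_directed_cycle(a + a_green)
            and has_directed_cycle(a + a_red + a_green))
```

As a small example, directing a red edge from node 1 to node 2 and a parallel green edge from 2 to 1 is a legal move on an otherwise empty A, since neither color alone forms a cycle but together they do.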


CONFLICT+(G_0)
Initial position: An ordered mixed multigraph G_0=(V_0,E_0,A_0,{>_i})
    (E_0 partitioned into "red" and "green")
Position before player I's move: An ordered mixed multigraph G=(V,E,A,{>_i})
Player I's move: Select a closed assignment (A_X) to an X⊆E such that
    (1) A_X has projections A_X^r, A_X^g
    (2) A_X^r∪A, A_X^g∪A have no directed cycles
    (3) A_X∪A has a directed cycle

Player II's move: Select y ∈ {r,g} and set E:=E\(edges in A_X^y) and A:=A∪A_X^y

Players I and II take turns moving. Player II always moves when I does.
Player I's goal is to prolong the game as much as possible.
Player II's goal is to end the game as soon as possible.

Figure 3.6
The game CONFLICT+

The game CONFLICT abstracts, in the legal moves of Player I, only the rules of
PREFIX derived from an unordered conflict graph (β has to create a cycle in the
conflict graph, while (β/α)_i, i=1,2 should not). In fact the assignments of directions to
edges of G(T) in PREFIX should also correspond to prefixes β and (β/α)_i, i=1,2 (see
Lemma 1, Section 2.2). CONFLICT can obviously be played on multigraphs with no
modifications of its rules.


We will now generalize the game CONFLICT to CONFLICT+ (see Fig. 3.6),
where in addition to the rules of CONFLICT a precedence rule is observed.

The input to the new game CONFLICT+(G) is an ordered mixed multigraph
G=(V,E,A,{>_i}). V is the vertex set, E is the multiset of undirected edges
partitioned into "red" and "green", A is the multiset of directed edges and {>_i} are
partial orders (e.g. all undirected edges incident at node i form a partial order >_i).
All conflict graphs (see Def. 9, Section 2.1) are such constructs. If A≠∅ some
conflicts have been resolved and the >_i's correspond to transaction partial orders.


Definition 15: Given an ordered mixed multigraph G=(V,E,A,{>_i}), and an
assignment (A_X) of directions to a multiset of edges X⊆E, we call this assignment
closed (with respect to G) when:

    If ij ∈ X and is directed from i to j and ik >_i ij then ik ∈ X. □
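Definition 15 translates directly into a check over the directed edges. In the sketch below every undirected edge has an identifier, an assignment maps edge identifiers in X to (tail, head) pairs, and `above[i][e]` lists the edges f at node i with f >_i e. This encoding and the name `is_closed` are assumptions made for illustration only.

```python
from typing import Dict, Set, Tuple

Assignment = Dict[str, Tuple[str, str]]   # edge id -> (tail, head)
Order = Dict[str, Dict[str, Set[str]]]    # node -> edge id -> edges above it

def is_closed(assignment: Assignment, above: Order) -> bool:
    """True iff, for every edge directed away from node i, every edge that
    precedes it in the partial order at i is also in the assignment."""
    for edge, (tail, _head) in assignment.items():
        for higher in above.get(tail, {}).get(edge, ()):
            if higher not in assignment:
                return False
    return True
```

Note that the constraint applies only at the tail of a directed edge: an edge directed into node i imposes no requirement on the other edges at i.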


Given a conflict graph G(T) and an assignment of directions to some of its edges
(A_X), that has no directed cycles, then A_X is realizable by a prefix in SR iff it is
closed. This follows easily from Theorem 2 and Lemma 1 (see Section 2.2).

Let the undirected edges of G be partitioned into "red" and "green", and let A_X
be a closed assignment of directions to X⊆E. It is easy to see that the following
closed assignments are uniquely determined. They are called the projections of A_X.

    A_X^i (where i="red" or "green"):
    (1) A_X^i ⊆ A_X
    (2) A_X^i is closed
    (3) all i edges of X are given directions in A_X^i

If {>_i} become the empty partial orders for every node, CONFLICT+ becomes
CONFLICT (i.e., X=R∪H, A_X^r = A_R, A_X^g = A_H). The real interest of
CONFLICT+ is its relation to PREFIX. A prefix <T,α> in PR(C) determines a
unique mixed ordered multigraph G^α(T) (see Def. 10, Section 2.1). In the next
section (Lemma 2, Section 4.1) we will show that for C=SR, PREFIX(<T,α>) and
CONFLICT+(G^α(T)) are equivalent. An example of CONFLICT, where an optimal
game leads to four moves, is presented in Fig. 3.7.


[Figure 3.7: panels (a)-(f), drawings not recoverable]

Figure 3.7
(a) G(T) initial position (solid "red", dashed "green")
(b) I's first move
(c) II chooses "red"
(d) I's second move
(e) II chooses "red"
(f) no legal moves for I

We will close this section with a brief discussion of an important special case of
the question {<T,α> ∈ M_C(b)?}, namely b=0. This problem is obviously in NP,
because all we have to do is guess a prefix satisfying conditions (1),(2) of Theorem 3
and check these conditions in polynomial time.

In the next section (Corollary 4.2, Section 4.1) we will prove that
{<T,α> ∈ M_C(0)?} is NP-Complete. This leaves us with the problem {<T,∅> ∈ M_C(0)?}.
We say that the conflict graph G(T) of a transaction system T contains a mixed cycle,
if it contains a cycle with edges e_1 and e_2 where e_1 corresponds to a conflict at site 1
("red") and e_2 to a conflict at site 2 ("green").


Corollary 3.3: For C=SR, if G(T) contains no mixed cycle then <T,∅> ∈ M_C(0).
This is also a necessary condition, whenever the transactions in T have no
crossedges.

Proof: The sufficiency is obvious from the characterization of SR and conditions
(1),(2) of Theorem 3. The necessity for transactions with special structure is easy for
two transaction systems. For more transactions we can use a straightforward
induction on the number of transactions (nodes of G(T)). □

For general transaction systems T and C=SR, the complexity of determining if
{<T,∅> ∈ M_C(0)?} is an interesting open question. For example all systems in Fig. 3.8
are in M_C(0), yet their conflict graphs contain mixed cycles.

4. The Complexity of PREFIX

This chapter contains our main result, which is an analysis of the game PREFIX
for C=SR.


4.1 PREFIX is PSPACE-Complete 


We will now prove the following theorem: 


Theorem 4: Let C=SR. For input T and b>0, determining whether the
minimax length of PREFIX(<T,∅>) is greater than b is PSPACE-Complete.


All the games we will examine in this section are in PSPACE. This follows
easily from the analysis in Chapter 3. Therefore we will present only the reduction of
a well known PSPACE-Complete problem to PREFIX. This is the problem QBF
(i.e., what is the truth value of a quantified boolean formula) [11,33,34].


QBF:

Input: A quantified boolean formula I_n of the form:

∃x_1 ∀x_2 ∃x_3 ... ∃x_{n-1} ∀x_n F(x_1,x_2,...,x_n)

where F is a boolean formula without quantifiers in 3CNF (3-conjunctive
normal form) of the variables x_1,...,x_n (n even).


Question: Is I_n true?


QBF can be viewed as a game between two players, the ∃-player and the
∀-player. These players take turns assigning values to the variables in the order these
variables are quantified in I_n (i.e., from left to right). First the ∃-player assigns a
value to x_1, then the ∀-player assigns a value to x_2, etc. The ∃-player wins if the
values assigned to the x_i's, i=1,...,n, make F(x_1,x_2,...,x_n) true; otherwise he loses. The
∃-player has a winning strategy iff I_n is true. This PSPACE-Complete problem is
used in most reductions to games [5,11,33,34,8,29].
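The game view of QBF can be made concrete with a short recursive evaluation. This is a minimal sketch, assuming the matrix F is supplied as a Python predicate over a tuple of booleans; the function name is illustrative. Odd-numbered variables are chosen by the ∃-player and even-numbered ones by the ∀-player, in quantifier order. The running time is exponential in n, as expected for a brute-force evaluation.

```python
from typing import Callable, Tuple

def exists_player_wins(n: int,
                       formula: Callable[[Tuple[bool, ...]], bool],
                       prefix: Tuple[bool, ...] = ()) -> bool:
    """True iff the ∃-player can force F to be true from the partial assignment `prefix`."""
    i = len(prefix) + 1               # index of the next variable x_i to be assigned
    if i > n:
        return formula(prefix)        # all variables assigned: evaluate F
    outcomes = [exists_player_wins(n, formula, prefix + (value,))
                for value in (False, True)]
    # x_1, x_3, ... belong to the ∃-player; x_2, x_4, ... to the ∀-player.
    return any(outcomes) if i % 2 == 1 else all(outcomes)
```

With n=2, the formula x_1 ∨ x_2 is won by the ∃-player (he sets x_1 to true), whereas x_1 ≠ x_2 is not, since the ∀-player can always copy x_1. These small matrices are not in 3CNF; they only illustrate the alternation.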




Another game on boolean formulas used in our proof is AE-QBF. This is similar
to QBF, except that the ∀-player makes all his moves before the ∃-player.


AE-QBF:

Input: A quantified boolean formula I_n of the form:

∀x_2 ∀x_4 ... ∀x_n ∃x_1 ∃x_3 ... ∃x_{n-1} F(x_1,x_2,...,x_n)

where F is a boolean formula without quantifiers in 3CNF (3-conjunctive
normal form) of the variables x_1,...,x_n (n even).


Question: Is I_n true?


AE-QBF is Π_2^P-Complete, where Π_2^P is a class of the polynomial time hierarchy
[33,11] corresponding to one ∀∃ alternation (see Fig. 4.1).
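For contrast with the alternating QBF game, the single ∀∃ alternation of AE-QBF can be sketched as two nested brute-force loops. This is an illustrative sketch only: the function name and the predicate encoding of F are assumptions, and the enumeration is exponential in n.

```python
from itertools import product
from typing import Callable, Tuple

def ae_qbf_true(n: int, formula: Callable[[Tuple[bool, ...]], bool]) -> bool:
    """True iff for all values of x_2, x_4, ..., x_n there exist values of
    x_1, x_3, ..., x_{n-1} that make F(x_1, ..., x_n) true (n even)."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    half = n // 2
    for evens in product((False, True), repeat=half):       # ∀-player moves first
        satisfied = False
        for odds in product((False, True), repeat=half):    # ∃-player answers
            values = [False] * n
            values[0::2] = odds     # positions of x_1, x_3, ...
            values[1::2] = evens    # positions of x_2, x_4, ...
            if formula(tuple(values)):
                satisfied = True
                break
        if not satisfied:
            return False
    return True
```

The matrix x_1 ≠ x_2 (n=2) is true as an AE-QBF instance, because the ∃-player sees x_2 before choosing x_1, although it is false under the QBF quantifier order ∃x_1 ∀x_2.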


Σ_0^P = Π_0^P = Δ_0^P = P
for all k ≥ 0:
Δ_{k+1}^P = P^(Σ_k^P)
Σ_{k+1}^P = NP^(Σ_k^P)
Π_{k+1}^P = co-Σ_{k+1}^P

Figure 4.1
The polynomial time hierarchy
P^Y = {L: there is a language L' ∈ Y s.t. L is P-time Turing reducible to L'}
NP^Y = {L: there is a language L' ∈ Y s.t. L is NP-time Turing reducible to L'}
Our reduction of QBF to PREFIX will proceed in four parts, which we outline
below from (I) to (IV).


(I) We show that CONFLICT, as defined in Fig. 3.5, is Π_2^P-hard. We
accomplish this by reducing AE-QBF to CONFLICT in Lemma 1. The input graphs
to CONFLICT are mixed (i.e., they may contain directed edges).


(II) We generalize CONFLICT to the game CONFLICT+, which has, in
addition to the rules of CONFLICT, a partial order on the edges incident at a node. The
definition is such that all possible conflict graphs G(T) can be inputs to
CONFLICT+. In Lemma 2 we prove that the game PREFIX (for C=SR) is a
special case of CONFLICT+.


(III) We prove that CONFLICT+ is PSPACE-Complete, even when the input
is a graph without directed edges. We accomplish this in Lemma 3 using many of the
constructs of Lemma 1.


(IV) Finally we prove that PREFIX(<T,∅>) is PSPACE-Complete by showing
that the graphs in Lemma 3 are in fact conflict graphs for some transaction system.


In Lemma 1 we will examine the game of CONFLICT (see Fig. 3.5). Its input is
a mixed graph G=(V,E,A), where E is partitioned into "red" and "green". Player I
picks an assignment of directions for a "red" subset of E (A_R) and for a "green"
subset of E (A_H). The choices he makes must be legal (i.e., A∪A_R and A∪A_H have no
directed cycles, while A∪A_R∪A_H has a directed cycle). Player II chooses "red" ("green"),
making the new directed board position A∪A_R (A∪A_H) from A. Player I wants to
make the game last and Player II wants to terminate it.
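The legality condition on a move of Player I is a pair of acyclicity tests plus one cyclicity test, which can be sketched as follows. The encoding (arcs as ordered pairs of node names) and the function names are assumptions made for illustration; the cycle test uses Kahn's topological-sort algorithm.

```python
def has_directed_cycle(arcs) -> bool:
    """True if the directed graph given by `arcs` (pairs (u, v)) contains a directed cycle."""
    nodes = {x for arc in arcs for x in arc}
    indegree = {v: 0 for v in nodes}
    successors = {v: [] for v in nodes}
    for u, v in arcs:
        successors[u].append(v)
        indegree[v] += 1
    ready = [v for v in nodes if indegree[v] == 0]
    removed = 0
    while ready:                      # repeatedly strip nodes of in-degree 0
        u = ready.pop()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return removed != len(nodes)      # leftover nodes lie on or behind a cycle

def is_legal_move(A, A_R, A_H) -> bool:
    """A∪A_R and A∪A_H must be acyclic, while A∪A_R∪A_H must contain a directed cycle."""
    A, A_R, A_H = set(A), set(A_R), set(A_H)
    return (not has_directed_cycle(A | A_R)
            and not has_directed_cycle(A | A_H)
            and has_directed_cycle(A | A_R | A_H))
```

On a triangle with "red" edges ab, bc and "green" edge ca and an empty board, directing the "red" edges as (ab), (bc) and the "green" edge as (ca) is legal. Once (ca) is already on the board the same "red" choice is illegal, because A∪A_R itself would be cyclic.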


The direction of an undirected edge e can become fixed during the game in two
ways. First, Player I may choose e as part of A_R (A_H) and Player II may choose
"red" ("green"). After this e becomes part of A, the directed section of the board
position. On the other hand, even if e has not become part of the directed section (A) before
Player I makes his new move, it is possible for A to contain a directed path between
the endpoints of e. Now e's direction is fixed, because it can only be used in one
direction if Player I's moves are to be legal. It is easy to see that if a move by Player
I is legal, A_R (A_H) must contain edges whose directions have not been fixed. Because
of this observation the following fact is easily seen to be true.
(0) If G has z "green" edges, CONFLICT(G) lasts at most 2z moves. If Player I
makes a move with two "green" edges whose directions have not been fixed, a move
of "green" by Player II would consume two "green" edges. Moreover, if Player I
makes a move with exactly one "green" edge (e) whose direction has not been fixed,
then no matter what the response of Player II is, e's direction becomes fixed
(i.e., either e becomes part of the new A or a path is induced in the new A
connecting the endpoints of e).
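For very small boards the minimax length of CONFLICT can be computed by exhaustive search, which is a convenient way to check observation (0) on examples. The sketch below is an illustration under stated assumptions: it follows the rules as paraphrased in the text (not Fig. 3.5 itself), treats G as a simple graph, counts each move of Player I and each reply of Player II as one move, and tries all 3^|E| direction choices per position. The function names are hypothetical.

```python
from itertools import product

def is_acyclic(arcs) -> bool:
    """True if the directed graph given by `arcs` has no directed cycle (Kahn's algorithm)."""
    nodes = {x for arc in arcs for x in arc}
    indegree = {v: 0 for v in nodes}
    successors = {v: [] for v in nodes}
    for u, v in arcs:
        successors[u].append(v)
        indegree[v] += 1
    ready = [v for v in nodes if indegree[v] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return removed == len(nodes)

def legal_moves(A, undirected):
    """Yield every legal pair (A_R, A_H) for Player I; edges are (u, v, colour)."""
    for choice in product((0, 1, 2), repeat=len(undirected)):  # skip / u->v / v->u
        A_R, A_H = set(), set()
        for (u, v, colour), c in zip(undirected, choice):
            if c:
                (A_R if colour == "red" else A_H).add((u, v) if c == 1 else (v, u))
        if is_acyclic(A | A_R) and is_acyclic(A | A_H) and not is_acyclic(A | A_R | A_H):
            yield A_R, A_H

def minimax_length(A, undirected) -> int:
    """Number of moves when Player I maximizes and Player II minimizes the game length."""
    A = frozenset(A)
    best = 0
    for A_R, A_H in legal_moves(A, undirected):
        replies = []
        for chosen in (A_R, A_H):      # Player II answers "red" or "green"
            remaining = [e for e in undirected
                         if (e[0], e[1]) not in chosen and (e[1], e[0]) not in chosen]
            replies.append(2 + minimax_length(A | chosen, remaining))
        best = max(best, min(replies))
    return best
```

On a triangle with two "red" edges and one "green" edge the search returns 2, which agrees with the bound 2z for z=1.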


We will use the notation MN for an undirected edge and (MN) for a directed
edge from M to N. Similarly M_1M_2...M_k will be an undirected and (M_1M_2...M_k) a
directed path from M_1 to M_k.
Lemma 1: Given a mixed graph G and a nonnegative integer b, determining
whether the minimax length of CONFLICT(G) is greater than b is Π_2^P-hard.


Proof: For an arbitrary instance I_n of AE-QBF we construct the mixed graph
G(I_n) using the rules (a) to (d) below. We will prove that I_n is true iff the game
CONFLICT can last more than n moves on G(I_n).


(a) For every existentially quantified variable x_i, i=1,3,...,n-1, in I_n a copy of the
graph in Fig. 4.2(c) is included as a subgraph of G(I_n). This subgraph contains 6
directed edges and 2 "red" undirected edges, namely T_iD_i (labelled with 1) and F_iE_i
(labelled with 0). Actually this is the graph of Fig. 4.2(b) without nodes A_i,B_i,M_i,N_i.
These are the ∃-subgraphs.


(b) For every universally quantified variable x_i, i=2,4,...,n, in I_n a copy of the
graph in Fig. 4.2(a) is included as a subgraph of G(I_n). This subgraph contains 6
directed edges, 1 "red" undirected edge T_iD_i (labelled with 1) and 1 "green"
undirected edge F_iE_i (labelled with 0). These are the ∀-subgraphs.


(c) For every clause of the 3CNF formula of I_n (i.e., F(x_1,x_2,...,x_n)) a copy of the
graph in Fig. 4.3 is included as a subgraph of G(I_n). This subgraph contains 35
directed edges and 21 "red" undirected labelled edges. For the kth clause (u∨v∨w)
(counting from left to right in F(x_1,x_2,...,x_n)), which has literals u,v,w, we have seven
possible paths from C_k to C_{k+1}. Each one of these paths corresponds to an
assignment of values to the literals u,v,w of the clause which makes the clause true
(i.e., only assignment 000 is excluded). The assignment can be read from the labels of
"red" edges on the path. Every one of the three columns, of seven labels each,
corresponds to the possible values of one literal. Also, for one literal (say u) four
directed edges go to F_u and three to T_u, depending on the label of the "red" edge
from which the directed edge starts. We call these directed edges (to F_u or T_u)
backedges. We use the following rule:

u=x_i ⇒ F_u=F_i and T_u=T_i,

u=¬x_i ⇒ F_u=T_i and T_u=F_i, for x_i a variable of I_n.


The backedges are connected so that, if the labels correspond to values of
variables and literals, a backedge connects two undirected edges iff their labels are
inconsistent (e.g., for x_1=1, u=¬x_1, a backedge connects T_1D_1 and "red" edges with
labels 1 in the column of u). These are the clause-subgraphs.
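The counts quoted for the clause-subgraph can be checked by enumerating the label triples directly. The sketch below is illustrative only (the function name is an assumption): it lists the seven satisfying label triples for (u∨v∨w), one per path, and counts how often each label occurs in one literal's column.

```python
from itertools import product

def clause_paths():
    """Label triples (u, v, w) of the paths through a clause-subgraph:
    every 0/1 assignment to the three literals except 000."""
    return [labels for labels in product((0, 1), repeat=3) if any(labels)]

paths = clause_paths()
red_edges = 3 * len(paths)                              # one labelled "red" edge per literal per path
ones_in_u = sum(1 for labels in paths if labels[0] == 1)
zeros_in_u = sum(1 for labels in paths if labels[0] == 0)
```

The enumeration gives 7 paths and 21 labelled "red" edges. In the column of u, four edges carry label 1 and three carry label 0, which matches the four backedges to F_u and three to T_u if each backedge starts at a "red" edge whose label is inconsistent with its target.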
(d) The graph G(I_n) is constructed by identifying nodes with the same name.
That is, S_p of ∃-subgraphs with S_q of ∀-subgraphs if p=q. Also F_p's or T_p's of ∃-
and ∀-subgraphs are identified with F_q's or T_q's of clause-subgraphs if p=q. We
also identify C_1=S_{n+1}. If there are m clauses in I_n we add the "green" edge
S_1C_{m+1}.


An example is provided in Fig. 4.4 for the AE-QBF:
I_n = ∀x_2∀x_4∃x_1∃x_3 (x_1∨x_2∨x_3)∧(x_2∨x_4∨¬x_3), if we delete the nodes A_i,B_i,M_i,N_i,
i=1,3, together with their incident edges. We will first examine some simple properties of G(I_n).


(1) Let G(I_n) contain z "green" edges. Then CONFLICT(G(I_n)) can always last 2z-2
moves and at most 2z moves. Here z=n/2+1. The game can last at most 2z moves because
of observation (0) (right before Lemma 1). It can always last 2z-2 moves, because
Player I can play z-1 times on the z-1=n/2 mixed cycles (F_iE_iT_iD_iF_i), i=2,4,...,n. His
moves are legal no matter what the response of Player II is.


(2) Let (S_1...C_{m+1}) be any directed path from S_1 to C_{m+1}, not using the
"green" edge S_1C_{m+1} and respecting the directed edges in G(I_n). We note that each
pair F_iE_i, T_iD_i, i=1,2,...,n, forms a cutset separating S_1 and C_{m+1}. Thus (S_1...C_{m+1})
contains F_iE_i or T_iD_i for all i=1,2,...,n.


(3) All paths (S_1...C_{m+1}) have to contain node C_1. If they contain a backedge it
is easy to see that they have to pass through C_1 at least two times. Therefore simple
paths (containing a node only once) (S_1...C_{m+1}) do not contain backedges.

Let us proceed with the proof of equivalence:


"only if" If I_n is true then Player I first makes n/2 moves on the ∀-subgraphs
using the mixed cycles (F_iE_iT_iD_iF_i), i=2,4,...,n. The n/2 moves of Player II fix
directions for all the undirected edges F_iE_i, T_iD_i, i=2,4,...,n. His choice of "red" turns
T_iD_i into (T_iD_i) and fixes the direction of F_iE_i to (E_iF_i) (because of the directed
path (E_iT_iD_iF_i), which now becomes part of A). This corresponds to assigning x_i the
value 1. Similarly his choice of "green" turns F_iE_i into (F_iE_i) and fixes the direction
of T_iD_i to (D_iT_i). This corresponds to assigning x_i the value 0.


At this point in CONFLICT 2z-2 moves have been made and we can say that
the choices of Player II have assigned values x*_i to the variables x_i, i=2,4,...,n. Since
I_n is true there exist values x*_i of the variables x_i, i=1,3,...,n-1, which make
F(x*_1,x*_2,...,x*_n) true. This assignment of values {x*} to variables {x} implies an
assignment of values to the literals of every clause {u(x*)} (e.g., u=¬x_i, x*_i=1
implies u*=0).


Let us describe the (n/2+1)st move of Player I. Consider the simple path
(S_1...C_{m+1})*, which consists of the following subpaths in the various subgraphs of
G(I_n).

(S_kT_kD_kS_{k+1}) if x*_k=1, k=1,2,3,...,n. In ∀-subgraphs the direction of T_kD_k has
been fixed to (T_kD_k). In ∃-subgraphs (T_kD_k) is used.

(S_kF_kE_kS_{k+1}) if x*_k=0, k=1,2,3,...,n. In ∀-subgraphs the direction of F_kE_k has
been fixed to (F_kE_k). In ∃-subgraphs (F_kE_k) is used.

In the kth clause-subgraph the path from C_k to C_{k+1}, whose labels are the
values assigned to the literals of the clause by {x*}. Such a path exists since no
clause is assigned the values 000 by {x*}.


We note that, because of the way (S_1...C_{m+1})* traverses ∀-, ∃- and clause-
subgraphs, the directed edges of G(I_n) and (S_1...C_{m+1})* form no directed cycle. Note
that no backedge has both its endpoints on (S_1...C_{m+1})*, because the labels in the
various subgraphs along (S_1...C_{m+1})* are consistent.


Using the rules of Fig. 3.5 Player I picks:

A "green" set H={S_1C_{m+1}} and directs it (A_H) from C_{m+1} to S_1.

A "red" set R={"red" edges in (S_1...C_{m+1})*} and directs them (A_R) along the
path (S_1...C_{m+1})*.


This is a legal move since A_R∪A and A_H∪A are acyclic, while A_R∪A_H∪A is not.
Therefore if I_n is true CONFLICT can last n+2 moves.


"if" If I_n is false we will prove that CONFLICT(G(I_n)) cannot last n+2 moves.
We will assume CONFLICT(G(I_n)) can last n+2 moves and reach a contradiction.


The move of Player I which has "green" edge S_1C_{m+1} ∈ H must be his (n/2+1)st
move. This is because, if the direction of some "green" edge has not been fixed yet,
any simple path (S_1...C_{m+1}) that Player I chooses would make it possible to fix the
directions for two "green" edges. This follows from property (2) of such paths,
proven above, and the structure of the ∀-subgraphs. Thus Player I must make n/2
moves involving the "green" edges in the ∀-subgraphs first. Moreover any choice of
Player II will fix their direction (by observation (0)). We will prove that there is a
sequence of choices by Player II that will not let Player I move another time.


Since I_n is false, ¬I_n is true, or

∃x_2 ∃x_4 ... ∃x_n ∀x_1 ∀x_3 ... ∀x_{n-1} ¬F(x_1,...,x_n).

Let the values of the x_i's, i=2,4,...,n, making this formula true be x*_i. For the first
n/2 moves of Player I, each one necessarily involving a single F_iE_i whose direction
has not been fixed, the response of Player II should be:

If x*_i=0 then "green". This fixes the directions of T_iD_i and F_iE_i into (D_iT_i) and
(F_iE_i) respectively.
If x*_i=1 then "red". This fixes the direction of F_iE_i into (E_iF_i).

The (n/2+1)st move of Player I is now constrained in several ways in order to be
legal. First, for the "green" set we know S_1C_{m+1} ∈ H, because it is the only "green"
edge whose direction has not been fixed. Second, for the "red" set we know that
{undirected "red" edges of a path (S_1...C_{m+1})} ⊆ R. Finally, (S_1...C_{m+1}) and the
directed part of G(I_n) must not contain a cycle. This path (S_1...C_{m+1}) must be simple
(no backedges by property (3)) and thus pass through all the subgraphs:

In a clause-subgraph it has to use one of the seven paths.

In a ∀-subgraph its behavior is constrained by the way the directions of edges
T_iD_i, F_iE_i (of which it contains exactly one) have been fixed.

In an ∃-subgraph it is constrained to contain exactly one of T_iD_i, F_iE_i. Else
(S_1...C_{m+1}) and the directed part of G(I_n) would contain a cycle. We extend the
assignment {x*} in the following way for i=1,3,...,n-1:

If (S_1...C_{m+1}) contains T_iD_i then x*_i=1 else x*_i=0.


Thus every candidate path (S_1...C_{m+1}) actually corresponds to an assignment
{x*} of values to the variables and {u*} to the literals of F. This assignment can be
read from the labels of edges along the path. In fact {x*} and {u*} are inconsistent.


By the way x*_i, i=2,4,...,n, were chosen, every candidate assignment makes
F(x*_1,...,x*_n) false. Thus a consistent assignment {u(x*)} to the literals must make the
literals in some clause (say the kth clause) 000. Our candidate path (S_1...C_{m+1}) uses
a subpath (C_k...C_{k+1}) in that clause, which has a label 1 for one of its literals.
Because of the initial connection of backedges, the backedge of that literal ends at a
node that belongs to the path (S_1...C_{m+1}) in a ∀- or ∃-subgraph. Therefore A_R∪A
cannot be chosen to be acyclic and no candidate (n/2+1)st move of Player I can be
legal. This is the desired contradiction. □
Figure 4.3 The kth clause subgraph

Figure 4.4 An example


We will now examine the game CONFLICT+(G), which has as input an
ordered mixed multigraph G=(V,E,A,{>_i}). The edge multiset E is partitioned into
"red" and "green". The undirected edges incident at node i belong to the partial
order >_i. The game is like CONFLICT((V,E,A)); the only difference is that
assignments A_X (corresponding to A_R∪A_H), A_X^r (containing all selected "red" edges
and corresponding to A_R) and A_X^g (containing all selected "green" edges and
corresponding to A_H) must be closed. That is:
if (ij) ∈ A_X and ik >_i ij then (ik) or (ki) ∈ A_X (unless of course ik already is in A). All
this is described exactly in Definition 15 and Fig. 3.6 of Section 3.2.
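The closure requirement, as it is stated in the preceding paragraph, can be sketched as a simple check. The representation is an assumption made for illustration: the graph is treated as simple, arcs are ordered pairs, and `greater_at[i]` is a set of pairs (k, j) meaning that edge ik >_i edge ij. Definition 15 itself is not reproduced here, so this follows only the one-line rule quoted in the text.

```python
def is_closed(A_X, A, greater_at) -> bool:
    """Check: if (i, j) is in A_X and ik >_i ij, then ik is directed in A_X or already in A."""
    # An undirected edge counts as oriented if either direction appears in A_X or A.
    oriented = {frozenset(arc) for arc in set(A_X) | set(A)}
    for i, j in A_X:
        for k, other in greater_at.get(i, ()):
            if other == j and frozenset((i, k)) not in oriented:
                return False
    return True
```

For example, with the single constraint QR >_Q QP, the assignment {(QP)} alone is not closed, while {(QP),(RQ)} is, and {(QP)} becomes closed once (RQ) is already on the board.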


As indicated in the previous section, CONFLICT (see Fig. 3.5) is a special case
of the game CONFLICT+ (see Fig. 3.6), which is important because of its relation
to PREFIX (see Fig. 3.4). The inputs of CONFLICT+ are slightly more general
constructs (i.e., ordered mixed multigraphs) instead of mixed graphs. They are
motivated by conflict graphs and realizable assignments of directions to their
edges.


From Definition 10, Section 2.1, and Lemma 1, Section 2.2, we have that a prefix
<T,α> uniquely determines an ordered mixed multigraph. This is because, given
<T,α>, we can construct G^α(T)=(V,E,A,{>_i}), which is the conflict graph G(T)
with some conflicts resolved (A), some conflicts unresolved (E), and the transaction
orderings on the unresolved conflicts. The assignment of directions A is closed (with
respect to the conflict graph G(T)) and moreover, if C=SR, it has no directed cycles
(see Theorem 2).
Lemma 2: Given a prefix <T,α> in PR(SR) and a nonnegative integer b, then
the minimax length of PREFIX(<T,α>) equals the minimax length of
CONFLICT+(G^α(T)).


Proof: Let us recall the following facts: 


(a) An assignment of directions (A_Z) to undirected edges (Z) of the
conflict graph G(T) is realizable by a prefix iff:


(i) A_Z is closed (with respect to G(T))
(ii) A_Z has no directed cycle (i_1i_2...i_ki_1) s.t. i_1i_2 ≥_{i_2} i_2i_3, ..., i_ki_1 ≥_{i_1} i_1i_2.


(b) Consider two realizable assignments A, A' of directions to edges of a
conflict graph G(T) and let <T,α> be a prefix realizing A. It is easy to see that
if A ⊆ A' then A'\A is closed (with respect to G^α(T)).


(c) Also recall that continuations <T,β> of <T,α> in PREFIX, with
projections β_i, i=1,2, have properties:
<T,α> realizes A, A has no cycles
<T,β> ∉ PR(SR), <T,β> realizes A', A' has a cycle
<T,β_i> ∈ PR(SR), <T,β_i> realizes A_i, A_i has no cycle, i=1,2.
We have that A'\A, A_1\A, A_2\A are closed (with respect to G^α(T)).
Moreover if 1 is the "red" site and 2 the "green" site and A_X=A'\A then we
have A_X^r=A_1\A, A_X^g=A_2\A.


To prove the lemma we use induction on j, where j=|actions in T and not in α|.
For j=0 and any b the lemma is true, since no move is possible (all conflicts are
resolved). We will assume the lemma is true for all b and all j, 0≤j≤j*-1, and prove it
true for j*. For every move in one game we will exhibit a move in the other, leading
to assignments realizable only by strictly larger prefixes.


"only if" From the discussion above, a move in PREFIX corresponds to a move
in CONFLICT+, and no matter what the choice of Player II is, the resulting
assignment of directions to the conflict graph G(T) is strictly larger than A and
realizable.


"if" A move in CONFLICT+ produces assignments A_X, A_X^r, A_X^g. Since these
are closed (with respect to G^α(T)) and the existing directed part of the board A is
closed (with respect to G(T)) we have that A_X∪A, A_X^r∪A, A_X^g∪A are closed (with
respect to G(T)).


We will show that A_X∪A is realizable by a <T,β> which is a continuation of
<T,α>. Using Lemma 1, Section 2.2, all that remains to be proven is that A_X∪A has
no directed cycles of form (ii) above. It is easily seen that such a cycle would be
completely contained (because of the closure property) in either A_X^r∪A or A_X^g∪A.
But since A_X^r∪A, A_X^g∪A must be acyclic such a cycle cannot exist. Thus A_X∪A is
realizable; in fact, using the construction of Lemma 1, Section 2.2, we can choose
<T,β> to be a continuation of <T,α>. Then it is easy to verify that A_X^r∪A, A_X^g∪A
are the assignments determined by the projections of <T,β> (and are strictly larger
than A).


Thus when CONFLICT+ has a move PREFIX has one also.


We will now prove that CONFLICT+(G) is PSPACE-Complete, even if the
directed part of G is empty.

Lemma 3: Given an ordered graph G=(V,E,∅,{>_i}) and a nonnegative integer
b, determining whether the minimax length of CONFLICT+(G) is greater than b is
PSPACE-Complete.


Proof: For an arbitrary instance I_n of QBF we can construct the mixed graph
G'(I_n) using the following subgraphs.

(a) For x_i, i=1,3,...,n-1, ∃-subgraphs of Fig. 4.2(b). These are similar to those
employed in Lemma 1, with additional nodes A_i,B_i,M_i,N_i and their edges.

(b) For x_i, i=2,4,...,n, ∀-subgraphs as in (b) of Lemma 1.

(c) For every clause in F(x_1,...,x_n) clause-subgraphs as in (c) of Lemma 1.
(d) The connections are as in (d) of Lemma 1, with the added edges: 
directed (A;Az 42). (Ag42Bi42) i=13,...0°3 | 
directed (AAgsD: (A; i+1Fj+) j=1)3,....n-1 
undirected "red" A_iB_{i+2}, i=1,3,...,n-3, and A_iF_{i+1}, i=1,3,...,n-1.

An example is exhibited in Fig. 4.4. Using G'(I_n) we can construct the following
ordered graph G(I_n)=(V,E,∅,{>_i}). Assume I_n has n variables and m clauses:


V: The vertex set of G'(I_n) with an additional vertex for every directed edge in
G'(I_n), which has K_n=10n+35m-2 directed edges. |V|=18n+64m-2.


E: These are the undirected edges of G'(I_n), partitioned into "red" and "green"
as in G'(I_n); moreover we replace every directed edge (RQ) of G'(I_n) (see Fig. 4.5(a))
with a triangle (see Fig. 4.5(b)). Thus G(I_n) has no directed edges. It is a graph
partitioned into "red" and "green" and has 23n+n/2+91m-5 "red" and 11n+35m-1
"green" edges.


{>_i}: To every edge incident at a node i we assign a number. We use the rules
of Fig. 4.6 and Fig. 4.5(c). The ordering >_i is the strict (no two different elements are
equal) total ordering imposed by these numbers at i.

For the kth triangle PQR, 1≤k≤K_n, which replaces a directed edge (RQ), we assign:
at P: PQ: 1+kK_n, PR: 2+kK_n
at Q: QP: 1+kK_n, QR: 2+kK_n
at R: RP: 1+kK_n, RQ: 2+kK_n

For the undirected edges of G'(I_n) we use the numbers 1,2,3 as in Fig. 4.6. Note:
at A_i: A_iB_i > A_iF_{i+1} > A_iB_{i+2}, i=1,3,...,n-1 (the last for i≠n-1)
at F_{i+1}: F_{i+1}A_i > F_{i+1}E_{i+1}, i=1,3,...,n-1
at B_{i+2}: B_{i+2}A_i > B_{i+2}A_{i+2}, i=1,3,...,n-3.


Figure 4.8
We will prove that I_n is true iff CONFLICT+(G(I_n)) can last more than 2z-2
moves, where z=11n+35m-1 (the number of "green" edges).
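The size formulas of the construction can be tabulated for a given instance, which is a quick sanity check on the arithmetic. The sketch below only evaluates the closed forms quoted in the text; the function name and the dictionary keys are illustrative. The relation z = K_n + n + 1 follows arithmetically from the two formulas, and the proof later uses it when it speaks of the zth (z=n+K_n+1) move.

```python
def lemma3_counts(n: int, m: int) -> dict:
    """Sizes of G(I_n) for n variables (n even) and m clauses, using the formulas in the text."""
    if n % 2 != 0 or n <= 0 or m <= 0:
        raise ValueError("expected even n > 0 and m > 0")
    k_n = 10 * n + 35 * m - 2                        # directed edges of G'(I_n), one triangle each
    return {
        "K_n": k_n,
        "vertices": 18 * n + 64 * m - 2,             # |V|
        "red": 23 * n + n // 2 + 91 * m - 5,         # "red" edges of G(I_n)
        "green": 11 * n + 35 * m - 1,                # z, the number of "green" edges
    }

counts = lemma3_counts(4, 2)                         # sizes for n=4 variables, m=2 clauses
```

For n=4 and m=2 this gives K_n=108, |V|=198, 271 "red" edges and z=113 "green" edges, and indeed 113 = 108 + 4 + 1.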


"only if" Assume that I_n is true. We will describe a strategy that will enable
Player I to make z moves (and the game to last 2z moves).


First Player I plays on all the triangles that we substituted for directed edges of
G'(I_n). At his kth move he plays in the (K_n-k+1)st triangle, 1≤k≤K_n (PQR in Fig.
4.5). The first move is:

A_X={(QP),(PR),(RQ)}

A_X^r={(PR),(RQ)}

A_X^g={(QP),(RQ)}

These are closed assignments (Def. 15, Section 3.2) with respect to the position of
the board. Moreover, if A are the directed edges on the board before the kth move,
A_X∪A has a cycle, while A_X^r∪A, A_X^g∪A do not. No matter what Player II's choices
will be, RQ becomes the directed (RQ) in the new A. By induction Player I can play
similarly on all triangles. Note that when Player I has played in all K_n triangles
PQR, all (RQ)'s are in the directed part of the board and the directions of the other
edges of the triangles have been fixed. Thus without loss of generality we can assume
all directions on the triangles as being in A and exclude them from our further
arguments about closed assignments.


Now Player I will make n moves alternating between ∃- and ∀-subgraphs (which
correspond to the variables of I_n, x_i, i=1,...,n), from the subgraph of x_1 to the
subgraph for x_n. Recall that QBF I_n can be viewed as the instance of a game between
two players (the ∃-player and the ∀-player), where the ∃-player has a winning
strategy. Player I will pattern his strategy on the winning strategy of the ∃-player of
the QBF game (for moves i+K_n, 1≤i≤n).


The (i+K_n)th move of Player I (1≤i≤n) is:


(a) If i=1,3,...,n-1 and the ∃-player makes x_i=x*_i=1 (based on the values x*_j
that have been assigned for 1≤j<i) then:
A_X={(T_iD_i),(D_iM_i),(B_iA_i), and (A_{i-2}B_i) if i>1}
A_X^r={(T_iD_i),(D_iM_i), and (A_{i-2}B_i) if i>1}
A_X^g={(B_iA_i), and (A_{i-2}B_i) if i>1}
It is easy to check that if the board position has directed edges A, these assignments
are closed. Also A_X∪A has cycle (T_iD_iM_iB_iA_iT_i) and A_X^r∪A, A_X^g∪A do not have
any cycle (A_{i-2}B_i's direction had been fixed to (A_{i-2}B_i) anyway). No matter what the
response of Player II is to this move, the path (S_iT_iD_iS_{i+1}) and the new directed part
of the board form no directed cycles.

If the ∃-player makes x_i=0 we use the symmetric cycle (F_iE_iN_iB_iA_iF_i).


(b) If i=2,4,...,n then Player I uses cycle (T_iD_iF_iE_iT_i):
A_X={(T_iD_i),(F_iE_i),(A_{i-1}F_i)}
A_X^r={(T_iD_i),(A_{i-1}F_i)}
A_X^g={(F_iE_i),(A_{i-1}F_i)}
Again it is easy to see that the move is legal. But now Player II's response is
significant. A choice "red" would correspond to the ∀-player assigning x_i=x*_i=1
and would fix directions to (T_iD_i) and (E_iF_i). Then only (S_iT_iD_iS_{i+1}) forms no
cycles with the new directed part of the board. A choice "green" would be symmetric
(i.e., x_i=x*_i=0 and only (S_iF_iE_iS_{i+1}) forms no cycles with the new directed part of
the board).


We have now reached the zth (z=n+K_n+1) move of Player I, and the ∃-player
has won his QBF game on I_n using assignment {x*}. Thus the derived assignment
{u(x*)} to the literals makes every clause of the formula of I_n true. We can use the
same move as was the last move in Lemma 1 and trivially check that it is legal.


"if" If I_n is false we will prove that, although 2z-2 moves are possible, 2z moves
are not, in CONFLICT+(G(I_n)). In this case ¬I_n is true and the ∀-player has a
winning strategy in the QBF game. We will pattern the strategy of Player II on this
strategy of the ∀-player.


Suppose that CONFLICT+(G(I_n)) can last 2z moves. It is easy to see that
every move of Player I must contain exactly one "green" edge whose direction has
not been fixed by previous moves (observation (0) before Lemma 1). So we can view
sequences of z legal moves by Player I as permutations of the z "green" edges and
name every move by the "green" edge whose direction it fixes.


(a) First let us look at legal PQ-moves, that is, moves whose "green" unfixed
edge belongs to a triangle. If this move (A_X) produces a cycle as in Fig. 4.7(a) we can
infer the following: The edge (RQ) must belong to A_X^r∪A and A_X^g∪A. This is
because A_X^r∪A must contain a directed path (P...Q) and QR >_Q QP. (Recall that
QP is the only "green" edge without a fixed direction in A_X.) Thus no matter what
the response of Player II is to such a PQ-move the edge (RQ) becomes part of A. On
the other hand a PQ-move producing a cycle (A_X) as in Fig. 4.7(b) is never legal.
This is because A_X^g∪A must contain {(PQ),(QR),(RP)}, a cycle. The existence of a
path (Q...P) in A_X^r∪A and the fact that RQ >_R PR and PR >_P QP force this situation. Thus
PQ-moves fix the direction of QR to (RQ). Finally, if Player I were ever to use a QR
in the direction (QR) in some other e-move (e a "green" unfixed edge), then a
response of "red" by Player II would consume two "green" edges (i.e., e and PQ).
Therefore Player I should regard edges RQ as directed (RQ), in order to be able to play
z times.


(b) Let us examine the A_iB_i-moves, i=1,3,...,n-1, and F_iE_i-moves, i=2,4,...,n.
Since the directed edges of G'(I_n) have to be respected, we can only have (B_iA_i) ∈ A_X
and (F_iE_i) ∈ A_X for legal assignments in these moves. This is because A_X∪A must
contain a cycle and all other edges incident at A_i (respectively F_i) have fixed
outgoing (respectively ingoing) directions. Now we can justify the construction in
Fig. 4.6(d) and 4.8. If (B_iA_i) ∈ A_X, from the >_{B_i} order we have that (A_{i-2}B_i) ∈ A_X∪A
(e.g., the direction of A_{i-2}B_i is fixed to (A_{i-2}B_i) because of a directed path
from A_{i-2} to B_i in G'(I_n)). From the >_{A_{i-2}} order we have that (B_{i-2}A_{i-2}) or (A_{i-2}B_{i-2}) ∈
A_X∪A. Similarly if (F_iE_i) ∈ A_X then (B_{i-1}A_{i-1}) or (A_{i-1}B_{i-1}) ∈ A_X∪A. We have
established that the A_iB_i-move must precede the A_{i+2}B_{i+2}- and F_{i+1}E_{i+1}-moves,
i=1,3,...,n-1.


(c) Finally let us examine the C_{m+1}S_1-move. For this move we need a simple
path (S_1...C_{m+1}) that respects directed edges in G'(I_n), can contain no backedges of
G'(I_n) (similarly to (3) of Lemma 1), and has to pass through S_n and S_{n+1} (the last
∀-subgraph). If the F_nE_n-move has not been played yet, the use of either (T_nD_n) or
(F_nE_n) by the (S_1...C_{m+1}) path would fix the direction of F_nE_n. Thus the C_{m+1}S_1-move
has to follow all the A_iB_i- and F_iE_i-moves, i=1,2,...,n.


We will now show that Player II can force Player I into a game which simulates a
QBF(I_n) game, where Player I is the ∃-player, Player II is the ∀-player and moreover
has a winning strategy. Player I chooses the values of x_i, i=1,3,...,n-1, and II the values
of x_i, i=2,4,...,n. Player I determines when Player II makes his choices (as long as x_i
precedes x_{i+1}, i=1,3,...,n-1). Thus the best Player I can do is assign a value to x_1,
force II to assign a value to x_2, assign a value to x_3, etc. Let us describe how these
assignments take place.
(1) The A_iB_i-move assigns a value to x_i, i=1,3,...,n-1. The only possible choices
for A_X are cycles (B_iA_iT_iD_iM_iB_i) for x*_i=1 or cycles (B_iA_iF_iE_iN_iB_i) for x*_i=0. This
is because directed edges in G'(I_n) must be respected, and for x*_i=1 we have the
following (x*_i=0 is symmetric):

(B_iA_iB_{i+2}A_{i+2}...) would use up B_{i+2}A_{i+2}.
(B_iA_iT_iD_iF_iE_i...) would introduce a cycle in A_X^r∪A.
(B_iA_iT_iD_iS_{i+1}...) would fix the direction of F_{i+1}E_{i+1}.

The strategy of Player II will be to always play "red", fixing the directions of T_iD_i,
F_iE_i and making vertex A_i inaccessible from S_i.


(2) The F_iE_i-move assigns a value to x_i, i=2,4,...,n. The arguments are exactly as
in the ∀-subgraphs of Lemma 1. Player II's choice fixes the direction of F_iE_i, thereby
making x*_i 1 or 0 and allowing a unique path from S_i to S_{i+1}, as in Lemma 1. Player
II assigns values to the x*_i's according to the winning strategy of the ∀-player (recall
that I_n is false and thus the ∀-player has a winning strategy).


As a result of all this analysis we see that when it is time for the C_{m+1}S_1-move,
Player II has forced F(x*_1,...,x*_n) to be false, and constrained (S_1...C_{m+1}) to a
unique path through the ∃- and ∀-subgraphs (e.g., the labels on the path are {x*}
exactly as in Lemma 1).


Thus the arguments of Lemma 1 apply to show that the C_{m+1}S_1-move cannot be
legal and CONFLICT+(G(I_n)) cannot last 2z moves. □

We have now practically completed the proof of Theorem 4.


Proof of Theorem 4: In Lemma 1 we have proven that CONFLICT(G) is
Π_2^P-hard. Using this lemma we have shown, in Lemma 3, that CONFLICT+(G) is
PSPACE-Complete for G an ordered graph (no directed edges). In Lemma 2 we
have shown the equivalence of PREFIX(<T,∅>) and CONFLICT+(G(T)). In order
to complete Theorem 4 all we have to do is argue that the ordered graph in the
reduction of Lemma 3 is a conflict graph for some T:


In fact G(I_n)=(V,E,∅,{>_i})=G(T) because:
V: every vertex i corresponds to a transaction T_i.
E: every edge e=ij corresponds to transactions T_i and T_j updating a uniquely
defined variable x_e. If e is "red" x_e is stored at site 1, if e is "green" x_e is stored at
site 2.
{>_i}: All orders are strict total orders, because every edge ij is assigned a different
number at i; thus all vertices are realizable by transactions.


Thus we have shown PREFIX(<T,∅>) to be PSPACE-Complete. □


The question whether PREFIX(<T,α>) can last more than b moves has several
interesting subcases.


For <T,α>:
(1) G^α(T) is a graph and {>_i} are strict (e.g., every transaction updates a variable
only once, two transactions never share more than one variable, and three transactions
do not share a variable).
(2) α=∅ (e.g., there are no directed edges or all conflicts are unresolved).
(3) The transactions in T contain no cross-edges (e.g., each >_i consists of two total
orders, one "red" and one "green". The "red" and "green" edges are incomparable.
This actually means that there are no transaction-defined messages).
(4) The {>_i} are of fixed size (e.g., no more than L actions per transaction).


For b: whether it is arbitrary or 0.


These cases with their complexities are exhibited in Table 1.


conditions (<T,a>)     Complexity                         Complexity (b=0)

(1)&(2)                PSPACE-Complete (Theorem 4)        in NP
(1)&(2)&(4) L=6        PSPACE-Complete (Corollary 4.1)    in NP
(1)&(2)&(4) L=4        Π_2^p-hard (Corollary 4.1)         in NP
(1)&(3)&(4) L=6        PSPACE-Complete (Corollary 4.3)    NP-Complete (Corollary 4.2)
(2)&(3)                in PSPACE                          in P (Corollary 3.3)


Table 1: Is the minimax length of PREFIX(<T,a>) greater than b?


Corollary 4.1: Whether the minimax length of PREFIX(<T,∅>) is greater than b
is PSPACE-Complete, even if the degree of the graph G(T) is less than or equal to 6. It is
Π_2^p-hard even if the degree of G(T) is less than or equal to 4.


Proof: We will slightly modify the gadgets of Lemma 3 without changing the
validity of its arguments. We replace clause-subgraphs with Fig. 4.9a, ∀-subgraphs
with Fig. 4.9b and ∃-subgraphs with Fig. 4.9c. Let us for the moment ignore the
nodes A_i,B_i, i=1,3,...,n-1. The construction gives us (by Lemma 1) that our decision
problem is Π_2^p-hard. Moreover the only configurations at nodes are those of
Fig. 4.10, thus our transactions need never have more than 4 actions. If on the other
hand we add in nodes A_i and B_i and connect A_i,B_{i+2},F_{i+1} using the subgraph of
Fig. 4.11 then the arguments of Lemma 3 are still valid. The only difference from
Lemma 3 is that A_iB_i-moves must precede the moves in the triangles corresponding
to (A_iB_{i+2}) and (A_iF_{i+1}). We can thus show that our decision problem is
PSPACE-Complete even if transactions are restricted to 6 actions.


Therefore [<T,∅>∈M(b)?] is PSPACE-Complete even if transaction systems
are very restricted.


Consider the following combinatorial problem, which is in NP. 


PATH(G,s,t)

Input: A mixed graph G=(V,E,A) (V=set of vertices, E=set of undirected
edges, A=set of directed edges) and two distinguished nodes s and t.
Output: Is there an assignment (A_E) of directions to the edges in E, such that
the digraph (V,A_E∪A) is acyclic, and contains a directed path from s to t?


Note that, if A is acyclic, there is always an A_E* such that (V,A_E*∪A) is acyclic.
Also it is easy to determine in the mixed graph G=(V,E,A) if t is reachable from s.
But both conditions simultaneously are hard to decide.
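

To make the two simultaneous requirements of PATH(G,s,t) concrete, the following is a minimal brute-force sketch in Python. It is an illustration only, not a method from the text: the function names (`is_acyclic`, `reachable`, `path`) and the list-of-pairs graph encoding are assumptions, and the search enumerates all 2^|E| orientations, so it is exponential.

```python
from itertools import product

def is_acyclic(vertices, arcs):
    """Repeatedly strip vertices with no incoming arc from a remaining vertex."""
    remaining = set(vertices)
    while remaining:
        sources = [v for v in remaining
                   if not any(b == v and a in remaining for a, b in arcs)]
        if not sources:
            return False  # every remaining vertex has a predecessor: a cycle exists
        remaining -= set(sources)
    return True

def reachable(arcs, s, t):
    """Depth-first search along directed arcs from s."""
    seen, stack = {s}, [s]
    while stack:
        v = stack.pop()
        for a, b in arcs:
            if a == v and b not in seen:
                seen.add(b)
                stack.append(b)
    return t in seen

def path(vertices, undirected, directed, s, t):
    """Try every orientation A_E of the undirected edges (exponential search)."""
    for flips in product([False, True], repeat=len(undirected)):
        oriented = [(v, u) if flip else (u, v)
                    for (u, v), flip in zip(undirected, flips)]
        arcs = list(directed) + oriented
        if is_acyclic(vertices, arcs) and reachable(arcs, s, t):
            return True
    return False
```

On the mixed graph with undirected edges {1,2},{2,3} and the directed edge (3,1), node 3 is reachable from node 1 and an acyclic orientation exists, yet no single orientation satisfies both conditions, so `path` answers no.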


Figure 4.9


Figure 4.10
Corollary 4.2: PATH(G,s,t) is NP-Complete, even if G has at most 2 undirected
edges incident at a node and at most 1 directed edge incident at a node.


Proof: Consider Lemma 1, where all edges F_iE_i become "red". Then the first
player chooses the values for all variables, and our QBF game becomes the
satisfiability problem. Finding a proper path (S_1..C_{m+3}) would answer
PATH(G(I_n),S_1,C_{m+3}). This and the refinements of Corollary 4.1 prove the
Corollary.


Corollary 4.3: Whether the minimax length of PREFIX(<T,a>) is greater than b
is PSPACE-Complete (for b arbitrary) and NP-Complete (for b=0). This is true
even if the transactions in T have no cross-edges and a fixed number of actions.


Proof: Another way of stating Corollary 4.2 is that the decision question
[<T,a>∈M(0)?] is co-NP-Complete. The analysis that follows (for b arbitrary) also
applies to this case. Therefore determining if PREFIX(<T,a>) can last more than 0
moves is NP-Complete.


In Lemma 3 we totally ordered all edges incident at a node, by assigning
numbers to them. Thus the transaction system realizing the {>_i} of G(I_n) had to
have cross-edges. In fact cross-edges are the only way we know of forcing the
creation of desired directed edges.


Given an instance of QBF (I_n) we can construct the ordered mixed graph G''(I_n)
as follows (recall the mixed graph G'(I_n) and the ordered graph G(I_n) of Lemma 3):


G''(I_n)=(V,E,A,{>_i}) where:
(V,E,A)=G'(I_n), with one exception. The edges A_iB_{i+2} (i=1,3,...,n-3) and A_iF_{i+1}
(i=1,3,...,n-1) are "green" and not "red".
{>_i} are those implied by the orders of G(I_n) of Lemma 3.


We can prove that I_n is true iff CONFLICT+(G''(I_n)) can last more than n
moves. The argument that is needed to prove equivalence is identical with that of
Lemma 3. Note that G''(I_n) has 2n "green" edges, of which n-1 have fixed directions.


It is easy to see that a prefix <T,a> can be constructed, from a transaction system
without cross-edges, such that G^a(T)=G''(I_n). (G^a(T) is G(T) with the resolved
conflicts). Thus by Lemma 2, we complete the proof of Corollary 4.3. By using the
gadgets of Fig. 4.9 we can restrict the transaction systems to sets of transactions with
at most 6 actions (e.g., the nodes A_i have two "green", two "red" and two directed
outgoing edges. "Green" and "red" edges at the same node are incomparable).


This proves that the decision question [<T,a>∈M(b)?] is PSPACE-Complete
even for T's without cross-edges and with a fixed number of actions per
transaction. □


From this analysis of special cases we see that two sets of constraints give us
equal power:
{(1)&(2)&(4) L=6} and {(1)&(3)&(4) L=6}.


Let us now examine the final special case, namely b=0. Since we fix b we
cannot use the equivalence above. From Corollary 4.2 we have that if a≠∅ and if T
has no cross-edges the problem is NP-Complete. From Corollary 3.3 if T has no
cross-edges and a=∅ the problem is in P.


We have left open two interesting problems:


(a) Given T without cross-edges and b>0, is the minimax length of
PREFIX(<T,∅>) greater than b moves? We conjecture this problem is
PSPACE-Complete.


(b) Given T, can PREFIX(<T,∅>) last more than 0 moves? This problem is in
NP and we conjecture it is also in P.
4.2 The Efficiency of Communication-Optimal Schedulers


In the previous section we have analysed the complexity of various cases of
PREFIX, or equivalently examined various cases of the decision problem
[<T,a>∉M(b)?].


In Section 3.1 we described a programming system in which we can express all
distributed schedulers. These schedulers consist of two processes, one at each site
(Q_1,Q_2), and realize SR (Definition 12, Section 3.1). That is, an input history h∈H can
lead to many possible computation paths. By executing the instructions on such a
path the scheduler outputs a history in C. For each path the output is in C; moreover
if h∈C and the delays of all messages are 0 the output must be h. We call the
scheduler polynomial time bounded if the number of instructions the processes
execute is bounded by a polynomial in n (for all possible paths). The size of the
input is measured by n, which is the number of actions in T.


Corollary 4.4: Unless NP=PSPACE, there is no communication-optimal
scheduler, which realizes SR and is polynomial time bounded. This is true even if
each transaction is restricted to be a sequence of six updates.


Proof: Suppose such a scheduler Q existed. We know that [<T,∅>∉M(b)?] is
PSPACE-Complete (even for restricted transaction systems, Corollary 4.1). We will
prove that there exists a nondeterministic polynomial time bounded decision
procedure for this problem. This would imply that NP=PSPACE, an unlikely fact.


Given T and b>0 we do the following:
(1) guess a history h=<T,a>∈SR (this can be easily checked)
(2) simulate the execution of Q on this history
(3) whenever a message is sent we guess its delay and in general guess a
computation path of Q
(4) keep count (with m) of the number of messages sent
(5) if m>b then say yes else say no
If [<T,∅>∉M(b)?] is true there will exist an input h and a computation path of
Q, where more than b messages are sent. We can guess the input and the
computation path with a polynomial number of guesses; this is because the size of h
is O(n) and all paths are polynomially bounded. If m>b that means that all schedulers
have to use more than b messages for inputs from the transaction system T. This is
obviously a nondeterministic polynomial time bounded algorithm for our problem. □


Similar results would hold even if we augmented our programming system with
the power to consult oracles in the polynomial hierarchy [11] (i.e., the hierarchy
would collapse beyond a certain level).


Let us note two open problems. 


(a) If we assume P=PSPACE it follows that we can construct efficient
schedulers (in both measures). The consequences of NP=PSPACE on the other
hand are unclear.


(b) If the decision problem [<T,a>∉M(b)?] is only NP-hard the arguments of
Corollary 4.4 no longer apply.


Our results indicate that a communication-optimal scheduler must be
computationally inefficient. It is still possible to analyze the information in T and design
various efficient, communication-suboptimal realizations of SR. We will end this
section by defining a simple open edge deletion problem. This problem can be used
as an upper bound on the minimum number of messages in order to realize SR.
Because of its simplicity it is also of independent combinatorial interest.


DMC(G)
Input: An undirected graph G, with edges partitioned into "red" and "green".
Output: Find the minimum number of edges, whose deletion leaves a graph
with no cycles containing both "red" and "green" edges.
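

For very small instances DMC(G) can be explored by exhaustive search. The Python sketch below is an illustration under stated assumptions, not an algorithm from the text: edges are encoded as hypothetical triples (u, v, color), a two-colored cycle is detected by looking for a simple path that closes a "red" edge and uses at least one "green" edge, and deletion sets are tried in order of increasing size. Both steps are exponential.

```python
from itertools import combinations

def _closes_mixed_cycle(edges, skip, node, target, visited, seen_green):
    """Search for a simple path node -> target avoiding edge `skip`
    that uses at least one green edge."""
    if node == target:
        return seen_green
    for j, (a, b, color) in enumerate(edges):
        if j == skip:
            continue
        if a == node:
            nxt = b
        elif b == node:
            nxt = a
        else:
            continue
        if nxt in visited:
            continue
        if _closes_mixed_cycle(edges, skip, nxt, target, visited | {nxt},
                               seen_green or color == "green"):
            return True
    return False

def has_two_color_cycle(edges):
    """True iff some simple cycle contains both a red and a green edge."""
    for i, (u, v, color) in enumerate(edges):
        # Every two-colored cycle contains a red edge; try to close each one.
        if color == "red" and _closes_mixed_cycle(edges, i, v, u, {v}, False):
            return True
    return False

def dmc(edges):
    """Smallest number of edge deletions that destroys all two-colored cycles."""
    for k in range(len(edges) + 1):
        for removed in combinations(range(len(edges)), k):
            kept = [e for i, e in enumerate(edges) if i not in removed]
            if not has_two_color_cycle(kept):
                return k
```

A triangle with two "red" edges and one "green" edge needs a single deletion, while a single-colored graph needs none.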


5. The Combinatorics of Locking 


The most common technique used for the resolution of conflicts in concurrency 
control is locking. In this chapter we will extend the elegant analysis of locking 
described in [39] from the centralized to the distributed case. In the process, the 
geometric criterion of [39] will be replaced by a simple combinatorial condition 
(i.e., the strong connectivity of a directed graph).


5.1 Distributed Locking 


Let us first present a simple extension of the definitions for locking, which 
appear in [39]. We will utilize the notions of Distributed Database Design (DDD), 
transaction, action, history and serializability from Section 2.1, with the following 
additions:


Definition 16: For the DDD=<Gp, Data, Stored-at, IC>, the Data is partitioned
into variables (Var) and locking variables (LVar). The function lock-of: Var->LVar
determines for every variable x its lock X (i.e., X is the lock-of(x)). The constraint
∧_(X∈LVar) X=0 is part of the integrity constraints IC.


We will use x for variable and X for its lock. Note that, as for all Data, locking
variable X is stored-at site(X). We might have that site(x)≠site(X) (e.g., a central site
is used for all locks). We might have that X is the lock of x only and site(x)=site(X)
(e.g., the fully distributed case). Or we could have two variables, which are at the 
same or different sites, and have the same lock (e.g., primary copy locking). The 
locks we will be dealing with are stored at a particular site, and are not global 
variables stored at many sites. 


The transactions and histories are partial orders of actions as in Definitions 2
and 4, but we can have more types of actions.


Definition 17: An action is either an update of a variable (in Var) as defined in
Def. 3 or a lock X or unlock X step, for some locking variable X (in LVar).
(a) The semantics of "lock X" are (X:= if X=0 then 1 else error)

(b) The semantics of "unlock X" are (X:= if X=1 then 0 else error)
We abbreviate "lock X" as Lx and "unlock X" as Ux, where X=lock-of(x). □


Note that we are dealing with exclusive locks. We will not discuss shared locks
(e.g., read or intention locks [13]).


Let T={T_1,T_2,...,T_m} denote an (ordinary) transaction system, that is, one without
"lock" or "unlock" steps.


Definition 18: A locking policy L is a mapping, which given an (ordinary)
transaction system T transforms it into a locked transaction system L(T). The
locking policy transforms each T_i of T into L(T_i) (i=1,2,...,m), by inserting only
Lx,Ux steps and precedences between them subject to the following constraints:

(1) The only way to insert Lx or Ux steps is as a Lx-Ux pair with Lx before and
Ux after an update of x, in the partial order of L(T_i). Moreover for each x there is at
most one Lx-Ux pair in L(T_i).
(2) For every update of an x in T_i there is a Lx before and a Ux after it in the
partial order of L(T_i). □


Note that a locking policy could be nondeterministic (i.e., it could produce many
different L(T)'s for a given T).


In a locked transaction L(T_i) all actions at the same site are totally ordered, by
Def. 3 of transactions. As in the case without locks, a distributed locked transaction
represents a set of total orders of its actions (i.e., those that respect its partial order).
A new feature for the distributed case is: we can have actions p,q concurrent in T_i
and Lx's, Ux's inserted in T_i, with such precedences as to make p an ancestor of q in
L(T_i). In other words the locking policy can restrict the parallelism inherent in T_i.
Let h be a history (or a prefix of a history) of L(T). We say that h is legal
(i.e., preserves the IC of locks in Definition 16) if between any two occurrences of Lx
in h there is an occurrence of Ux. We denote this as h∈M(L(T)). Let L^{-1}(h) be the
induced subgraph of h if all lock and unlock steps were removed. The set of histories
O(L)=L^{-1}(M(L(T))) is called the output of the locking policy L and captures the
parallelism supported by L.
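

As an illustration of legality for a totally ordered history, the Python sketch below replays the lock semantics of Definition 17 and then strips the lock steps, as L^{-1} does. The encoding is an assumption made here for the example: each step is a triple (transaction, kind, variable) with kind "L", "U" or "W", where "W" is a hypothetical tag standing for an update.

```python
def is_legal(history):
    """Replay lock/unlock steps; a lock on a held variable or an unlock on a
    free one corresponds to the 'error' branch of Definition 17."""
    locked = set()
    for _txn, kind, var in history:
        if kind == "L":
            if var in locked:
                return False  # two Lx steps with no Ux in between
            locked.add(var)
        elif kind == "U":
            if var not in locked:
                return False
            locked.remove(var)
    return True

def strip_locks(history):
    """Drop all lock and unlock steps, keeping the updates in order."""
    return [step for step in history if step[1] not in ("L", "U")]
```

A prefix that ends while some locks are still held is accepted, which matches the use of legal prefixes in the definition of deadlock freedom.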


Definition 19: A locked transaction system L(T) is safe if every history in O(L)
is serializable. It is deadlock-free if for any legal prefix a of a history of L(T), there is
a suffix w such that aw∈M(L(T)). □


It is easy to see that if L(T) is safe we can realize M(L(T)) using a scheduler,
which consists of a simple lock manager and a mechanism for avoiding or breaking
deadlocks. The deadlock problem becomes more acute in a distributed
environment, where it requires the use of messages [22,23].


As an example of a distributed locking policy consider two-phase locking (2PL).


2PL: All lock steps in a distributed locked transaction must precede all unlock
steps in the transaction's partial order.
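

The 2PL condition on a partial order can be checked directly once the precedences are closed under transitivity. The Python sketch below is a minimal illustration; the encoding of steps as pairs such as ("L","x") and of the partial order as a list of precedence pairs is an assumption made for the example.

```python
def transitive_closure(pairs):
    """Close a set of precedence pairs (a, b), read 'a precedes b'."""
    closure = set(pairs)
    while True:
        extra = {(a, d) for a, b in closure for c, d in closure if b == c}
        if extra <= closure:
            return closure
        closure |= extra

def is_two_phase(steps, precedences):
    """True iff every lock step precedes every unlock step in the partial order."""
    closure = transitive_closure(precedences)
    locks = [s for s in steps if s[0] == "L"]
    unlocks = [s for s in steps if s[0] == "U"]
    return all((l, u) in closure for l in locks for u in unlocks)

# A transaction updating x at one site and y at another.
Lx, Wx, Ux = ("L", "x"), ("W", "x"), ("U", "x")
Ly, Wy, Uy = ("L", "y"), ("W", "y"), ("U", "y")
steps = [Lx, Wx, Ux, Ly, Wy, Uy]
per_site = [(Lx, Wx), (Wx, Ux), (Ly, Wy), (Wy, Uy)]
cross_edges = [(Lx, Uy), (Ly, Ux)]
```

With only the per-site orders the transaction is not two-phase, because Lx and Uy are incomparable; adding the two cross-edges makes it two-phase, which shows how a locking policy can introduce precedences between sites.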


Every total order consistent with a 2PL distributed transaction is a 2PL
centralized transaction. Thus we can infer, from the safety of centralized 2PL, its
safety for the distributed case. Similar easy generalizations exist for the safe and
deadlock-free tree-[30], digraph-[39] or hypergraph-[39] policies, which apply to the
structured Data case.


An example of a distributed 2PL transaction system is presented in Figure 5.1.
This example also shows that O(2PL) (i.e., the set of legal output histories without
the locks) is not a concurrency control principle as defined in Def. 11 Section 2.1.
This is because the ordering of lock, unlock steps introduces cross-edges that were
not part of the initial transactions T.


Our main task now will be to generalize the results of [39] towards a
characterization of safe systems.


5.2 The Safety of Distributed Locked Transaction Systems


Let T_i (i=1,2) denote a pair of locked distributed transactions and T_i^+ (i=1,2) a
pair of totally ordered locked distributed transactions. The jth step of T_i^+ is T_ij^+,
1≤j≤m_i. As noted above T_i={T_i^+ | T_i^+ respects >_{T_i}} (i=1,2).


Consider a transaction system {T_i^+, i=1,2}. In the coordinated plane
(T_1^+,T_2^+) (see Fig. 5.2) take the two axes to correspond to T_1^+ and T_2^+, and the
integer points 1,2, etc. on these axes to correspond to the steps T_11^+,T_12^+, etc.
(respectively T_21^+,T_22^+, etc.) of the transactions. A point p may represent a
possible state of progress made toward the completion of T_1^+ and T_2^+. These
transactions will contain properly nested lock-unlock steps. Each variable x such that
both T_1^+ and T_2^+ contain a Lx-Ux pair has the effect of creating a forbidden
region (a rectangle delimited by the grid lines corresponding to the Lx-Ux steps), the
points of which do not represent reachable states (see Fig. 5.2). Adding such
rectangles to the plane has some consequences. For example, the point u is now
unreachable, yet not in any rectangle; in contrast, point d is a state of deadlock.


A history that is totally ordered has the following geometric image [39]. It is a
nondecreasing curve from the point (0,0) to the point (m_2+1, m_1+1), not passing
through any other grid point and not through any rectangle (e.g., h in Fig. 5.2). To
read the history off any such curve we simply enumerate the grid lines that it
intersects. Two totally ordered serial histories are represented by the curves h_1,h_2 in
Fig. 5.2.


From [39] we have the following characterization. 


Proposition 2: A history, which is totally ordered, for the transaction system
{T_i^+, i=1,2} is not serializable iff the corresponding curve separates two
rectangles. □


No two rectangles touch at a grid point (by our definition of locked transaction
systems). In order to study the safety of {T_i^+, i=1,2} the only rectangles we have to
consider correspond to pairs of Lx-Ux steps, where both T_i^+'s update x. The following Lemma
for distributed locked transactions is a direct consequence of Proposition 2, because
every nonserializable history corresponds to some set of totally ordered
nonserializable histories.


Lemma 1: A distributed locked transaction system {T_1,T_2} is safe iff for all
pairs T_1^+,T_2^+ there is no curve (corresponding to a history) that separates two
rectangles in the (T_1^+,T_2^+)-plane.


An example of an unsafe system {T_1,T_2}, where only relevant Lx-Ux steps are
given, is provided by Fig. 5.3(a). In Fig. 5.3(b) we have a pair T_1^+,T_2^+ that
happens to be safe. In Fig. 5.3(c) we have a pair T_1^+,T_2^+ that illustrates why the
system is unsafe.


Since there is an exponential number of possible pairs T_1^+,T_2^+ an iterative
application of the test of Proposition 2 (which involves an O(n log n log log n)
computation of a "closure" for a geometric region of rectangles [21]) is no longer
efficient.


Our contribution will be an efficient combinatorial (as opposed to geometric) 
test (i.e. sufficient condition) of safety for distributed locked transaction systems. Our 
combinatorial test (Theorem 5) provides an alternate way of characterizing the 
centralized problem. It is also a necessary condition of safety (Theorem 6) for 
centralized transactions and transactions distributed at two sites. For more sites a
complete and efficient characterization is an open problem.


Figure 5.3
(a) Distributed locked transactions (x at site 1 and y,z at site 2)
(b) safe {T_1^+,T_2^+}
(c) unsafe {T_1^+,T_2^+}
(d) DL(T_1,T_2)
Let us define:


DL(T_1,T_2): Given two locked distributed transactions T_1,T_2 construct the
digraph DL(T_1,T_2)=(V,A) such that:
(a) V is the vertex set, with vertex x iff both T_1 and T_2 contain a Lx-Ux pair.
(b) A is the arc set, with arc (xy) iff (Ly >_1 Ux and Lx >_2 Uy).


An example of DL(T_1,T_2) is presented in Fig. 5.3(d). From the definition of
DL(T_1,T_2) we have that (xy)∈A iff the upper-left corner of the x-rectangle is in the
lower-right corner formed by the y-rectangle on all possible (T_1^+,T_2^+)-planes (see
Fig. 5.4). This implies that in every such plane no curve corresponding to a history
can pass below the y-rectangle and above the x-rectangle.


Figure 5.4
(xy)∈DL(T_1,T_2). At most three types of paths are feasible.


Theorem 5: Let {T_1,T_2} be a locked transaction system. If DL(T_1,T_2) is
strongly connected, then {T_1,T_2} is safe.


Proof: Let T_1 and T_2 conflict at variables x_1,x_2,...,x_k. Then for
DL(T_1,T_2)=(V,A) we have V={x_1,x_2,...,x_k}.


In a (T_1^+,T_2^+)-plane we can associate every path s, that corresponds to a
possible output history of a lock manager, to a vector of k binary values
s=(b_1,b_2,...,b_k). These values are:

b_i=1 if s passes above the x_i-rectangle

b_i=0 if s passes below the x_i-rectangle

Therefore if (x_ix_j)∈A we can say that for all (T_1^+,T_2^+)-planes and paths s
b_i≤b_j (i.e., only b_i=1, b_j=1 or b_i=0, b_j=0 or b_i=0, b_j=1 are allowed).


Since DL(T_1,T_2) is strongly connected there is a directed path (x_i..x_j) and a
directed path (x_j..x_i) for 1≤i,j≤k, i≠j. Thus always b_i≤...≤b_j and b_j≤...≤b_i for all i,j.
This implies that the only allowable values for the vectors s are (0,0,...,0) and
(1,1,...,1). Thus for all (T_1^+,T_2^+)-planes there is no path corresponding to a history
separating two rectangles. Therefore {T_1,T_2} is safe. □
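

The sufficient test of Theorem 5 can be sketched in a few lines of Python. This is an illustration under assumptions, not the author's implementation: each transaction is given as a list of precedence pairs over steps such as ("L","x") and ("U","x"), read "a precedes b", and an arc (x,y) is added iff Ly precedes Ux in the first transaction and Lx precedes Uy in the second, following the definition of DL. The helper names are hypothetical, and the closure and reachability routines are written for clarity rather than speed.

```python
def transitive_closure(pairs):
    """Close a set of precedence pairs under transitivity."""
    closure = set(pairs)
    while True:
        extra = {(a, d) for a, b in closure for c, d in closure if b == c}
        if extra <= closure:
            return closure
        closure |= extra

def build_dl(prec1, prec2, variables):
    """Arc set of DL for the shared variables of two locked transactions."""
    c1, c2 = transitive_closure(prec1), transitive_closure(prec2)
    return {(x, y) for x in variables for y in variables
            if x != y
            and (("L", y), ("U", x)) in c1
            and (("L", x), ("U", y)) in c2}

def strongly_connected(vertices, arcs):
    """Check that one root reaches, and is reached by, every vertex."""
    def reach(start, edges):
        seen, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for a, b in edges:
                if a == v and b not in seen:
                    seen.add(b)
                    stack.append(b)
        return seen
    vs = set(vertices)
    if not vs:
        return True
    root = next(iter(vs))
    reverse = {(b, a) for a, b in arcs}
    return reach(root, arcs) == vs and reach(root, reverse) == vs

def chain(steps):
    """Precedence pairs of a totally ordered transaction."""
    return [(steps[i], steps[i + 1]) for i in range(len(steps) - 1)]
```

For two totally ordered two-phase transactions such as Lx Ly Ux Uy the digraph has both arcs (xy) and (yx), so the test succeeds. For Lx Ux Ly Uy against Ly Uy Lx Ux only the arc (yx) is present, so the test fails, as expected for that unsafe pair.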


In order to characterize safety of a distributed system we need a succinct way of
describing the forbidden regions in all (T_1^+,T_2^+)-planes. We use this
characterization (as in the proof of Theorem 5) to produce a short proof that all
paths, which correspond to output histories of a lock manager, must either pass
below or above all forbidden regions.


The simple condition of safety provided by Theorem 5 is a sufficient one. It is
necessary for centralized transactions (Lemma 2), where another obvious complete
characterization is the geometric pattern on the unique (T_1^+,T_2^+)-plane. It is also a
necessary characterization for transactions distributed between two sites (Theorem 6).
Recall that the safety question is in co-NP, whereas its negation is in NP; that is, to
prove a system unsafe all we have to do is guess a nonserializable history in O(L) and
verify that fact in polynomial time.


We should point out that DL(T_1,T_2) ignores some of the precedences of T_1 and
T_2. This restricts the proof of necessity to two sites and indicates that a complete
characterization of forbidden regions for an arbitrary number of sites could be a hard
problem.


If DL(T_1,T_2)=(V,A) is not strongly connected then it has more than one
strongly connected component. Among these there is a strongly connected
component with no incoming edges from other strongly connected components. We
call such a component a dominator X, where X⊆V denotes its set of nodes. In fact
the only property of the dominator we will use is that there are no incoming edges in
X from nodes in V\X (and not its strong connectivity).
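

A dominator can be found by plain reachability: the strongly connected component of a vertex v has no incoming edges from outside exactly when every ancestor of v is also a descendant of v. The Python sketch below illustrates this; the function names and the arc-set encoding are assumptions, and no attempt is made at a linear-time implementation.

```python
def reach(start, arcs):
    """Vertices reachable from start along directed arcs (including start)."""
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for a, b in arcs:
            if a == v and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

def find_dominator(vertices, arcs):
    """Return a strongly connected component with no incoming arcs from other
    components, or None when the digraph is strongly connected."""
    reverse = {(b, a) for a, b in arcs}
    for v in vertices:
        ancestors = reach(v, reverse)
        component = ancestors & reach(v, arcs)
        # No outside vertex reaches v, and the component is not all of V.
        if ancestors == component and component != set(vertices):
            return component
    return None
```

For the arc set {(y,x)} the dominator is {y}; for a strongly connected digraph the function returns None.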


We will prove necessity of the condition in Theorem 5 using the following
intuitive construction. Given T_1,T_2, DL(T_1,T_2)=(V,A) not strongly connected and a
dominator X, we will construct two special total orders T_1^+,T_2^+. In T_1^+ the
actions (Lx-Ux, x∈X) will be executed as late as possible after the actions (Lz-Uz,
z∉X). In T_2^+ we do the opposite. This tends to isolate the forbidden region
corresponding to X in the upper left corner. Each time we will argue that this region
and all other rectangles can be separated as in Fig. 5.5, by a curve which will
obviously correspond to a possible output history. Therefore we will prove something
stronger than lack of safety, namely: "If X is such that there are no incoming edges in
X, then we can separate all x-rectangles from all z-rectangles, x∈X, z∈V\X".


Lemma 2: Given a locked transaction system {T_1,T_2}, where T_1,T_2 are totally
ordered, if DL(T_1,T_2) is not strongly connected then {T_1,T_2} is unsafe.


Proof: Obviously there is only one (T_1^+,T_2^+)-plane. Pick a dominator X in
DL(T_1,T_2). By Theorem 5 all its rectangles form a region that is above an increasing
curve, whose corners correspond to lower right corners of x_i-rectangles, x_i∈X
(see Fig. 5.6). Let z∉X, then the z-rectangle must be below that curve. If it is not
there is an x_i∈X such that Lz >_2 Ux_i and Lx_i >_1 Uz (since T_1,T_2 are totally
ordered) implying that (zx_i)∈DL(T_1,T_2), a contradiction. □


Figure 5.6


Theorem 6: Let T={T_1,T_2} be a locked transaction system, where T_1,T_2 are
distributed at two sites. If DL(T_1,T_2) is not strongly connected then T is unsafe.


Proof: For this type of distributed transactions there could be an exponential
number of possible (T_1^+,T_2^+)-planes. Let X be a dominator of DL(T_1,T_2). We use
X to construct two special total orders T_1^+,T_2^+ that will help us separate all
x-rectangles (x∈X) from all z-rectangles (z∉X) and, since X and V\X are nonempty,
this will provide us with a certificate of unsafeness. We will use the shorter notation
>_i instead of >_{T_i}, and ≥_i for "precedes or can be concurrent to in transaction T_i".


Let z, x, y be such (if they exist) that:
(1) z∉X and x,y∈X
(2) Lz >_2 Ux and Ly >_1 Uz
Then we can infer:
(3) x≠y and Uy ≥_2 Ux and Ly ≥_1 Lx.
Since X is a dominator of DL(T_1,T_2), the digraph cannot contain either of the directed edges
(zx) or (zy). We can infer (3) because, if x=y then (zx)∈DL(T_1,T_2), or if (Ux >_2 Uy) then
(Lz >_2 Uy) and (zy)∈DL(T_1,T_2), or finally if (Lx >_1 Ly) then (Lx >_1 Uz) and
(zx)∈DL(T_1,T_2).


For any z, x, y satisfying (1), (2) and (3) we can construct the following partial
orders:

T_1' is T_1 with the added precedence Ly >_1 Lx.

T_2' is T_2 with the added precedence Uy >_2 Ux.
Obviously T_i' (i=1,2) are partial orders. Also T_i' is T_i (i=1,2) with at most one
precedence added (i.e., if the additional precedence were already in T_i then T_i'=T_i).
Therefore if {T_1',T_2'} is unsafe so is {T_1,T_2}.


Based on the existence of only two sites we will prove the following important
fact about the new system T'={T_1',T_2'}:


(I) X is a dominator of DL(T_1',T_2')


Since x, y, z are distinct variables we have three cases: case (a) x,y stored at the
same site, case (b) x,y stored at different sites and z stored at the same site as x,
case (c) x,y stored at different sites and z stored at the same site as y.
Case (a): If x,y are stored at the same site we must have (Ly >_1 Lx) and
(Uy >_2 Ux) (these actions cannot be concurrent in T_1 or T_2). Therefore T_i'=T_i
(i=1,2) and (I) follows trivially.


Case (b): We have that x and z are stored at the same site and (Lz >_2 Ux) (the
possible positions of Lz are illustrated in Fig. 5.7). Since (zx)∉DL(T_1,T_2) we must
have (Uz >_1 Lx) (i.e., these actions cannot be concurrent in T_1, because x and z are at
the same site). Since (Ly >_1 Uz >_1 Lx), we have that already (Ly >_1 Lx) and therefore
T_1'=T_1. We only add precedence (Uy >_2 Ux) to T_2 to obtain T_2'.

The only way for new edges to be generated in DL(T_1',T_2') from a z'∉X into an
x'∈X is for (Lz' >_2 Uy) and (Ux ≥_2 Ux') (x' could be x). Moreover z' and x' should
be stored at different sites (otherwise Lz',Ux' would have been ordered already in
T_2) and in T_1'=T_1 we must have (Lx' >_1 Uz').

If z' and x were stored at the same site, x' must be stored at the site of y. Thus in
T_2 we must have had (Lz' >_2 Uy and Uy >_2 Ux') (otherwise the new edge would
have introduced a cycle in T_2). Therefore Lz' and Ux' were already ordered in T_2, a
contradiction.

If z' and y were stored at the same site, x' must be stored at the other site and
Fig. 5.7 illustrates the possible positions of Lz' and Ux' in T_2. From these ranges of
Lz' and Ux' in T_2, we can derive the possible positions of Uz' and Lx' in T_1. Since
DL(T_1,T_2) cannot contain either (z'y) or (zx') and since (Lz' >_2 Uy and Lz >_2 Ux'),
we must have (Uz' >_1 Ly and Uz >_1 Lx'). It easily follows from the established ranges
that T_1 contains a cycle (Uz Lx' Uz' Ly Uz), a contradiction.

This proves (I) for this case.


Case (c): This case is symmetric with case (b). The argument that proves (I) is
similar to the one above. The ranges of Lz', Uy' in T_2 and Uz', Ly' in T_1 are
illustrated in Fig. 5.8. This time the additional precedence is (Ly >_1 Lx), and z'∉X,
y'∈X, and z' must be stored at the site of x, and y' at the site of y.


This completes the proof of (I).


Starting from T we can construct a sequence of transaction systems T,T',...,T*
(of length polynomial in |T|) such that in T*:
(i) X is a dominator of DL(T_1*,T_2*)
(ii) If (z∉X), (x,y∈X), (Lz >_2* Ux), (Ly >_1* Uz) then (Uy >_2* Ux), (Ly >_1* Lx).
Now all we have to do is produce the total orders T_1^+, T_2^+ from topologically
sorting T_1*, T_2*. We use two tricks. First, we place the Ux (i.e., x in X) steps as early
as possible in T_2^+. Second, we place the Lx (i.e., x in X) steps as late as possible in
T_1^+; moreover if Ux is before Ux' in T_2^+ we put Lx before Lx' in T_1^+ (if
possible).


It is easy to see that a nondecreasing curve lower-bounding the area of the
rectangles in X is created. Also if (Ly >_1+ Uz) for some z∉X, and Ly forms part of
this curve and is closest to Uz (see Fig. 5.9) then we can easily prove that
(Ly >_1* Uz). (From the way T_1^+ was constructed, if there is a closer (Ly_c >_1* Uz)
we must have (Ly >_1* Ly_c), else Ly_c would have been scheduled before Ly in T_1^+).
From the properties of T* we know that for all x∈X such that (Lz >_2* Ux) we have
(Uy >_2* Ux). By the way T_2^+ was constructed (Uy as early as possible) we can infer
(Uy >_2+ Lz).


Therefore z-rectangles are below or to the left of all x-rectangles in the
(T_1^+,T_2^+)-plane. This completes the proof of Theorem 6. □


Figure 5.9


The condition of Theorem 6 cannot be applied to systems {T_1,T_2} distributed at
more than two sites. An example demonstrating that fact is illustrated in Fig. 5.10,
where although we have a dominator X={x_1,x_2} (Fig. 5.10(a)) we cannot separate it
from the other rectangles (Fig. 5.10(b) and (c)).


Figure 5.10
(a) T
(b) T' is not a transaction system
(c) DL(T) has dominator {x_1,x_2}


Thus we can test safety of distributed transaction systems T={T_1,T_2}, on two
sites in O(n) time [1]. In fact the proof of Theorem 6 gives us the following
nondeterministic polynomial time algorithm to decide if an arbitrary system T is
unsafe.


Algorithm UNSAFE: Given T={T_1,T_2} a locked transaction system.


(1) Guess a (nonempty) set of rectangles X that are above a curve, which
corresponds to a nonserializable history. Let Z be the (nonempty) set of the rest of
the rectangles.


(2) Start with T_1*=T_1, T_2*=T_2 and keep augmenting them by the following
rule:
If z∈Z, x,y∈X, (Lz >_2* Ux), (Ly >_1* Uz) then add (Uy >_2* Ux), (Ly >_1* Lx).


(3) Check if T_1*, T_2* are partial orders and if DL(T_1*,T_2*) has no edges (zx)
for z∈Z, x∈X.


(4) If (3) is true say yes.
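

The steps of Algorithm UNSAFE can be rendered deterministically by replacing the guess of step (1) with an enumeration of all nonempty proper subsets X. The Python sketch below is only an illustration under stated assumptions: transactions are lists of precedence pairs over steps ("L",v) and ("U",v), all helper names are hypothetical, and the enumeration makes the procedure exponential. When x=y the rule of step (2) would add a reflexive pair; the sketch skips that case, since the check of step (3) then already finds the edge (zx).

```python
from itertools import combinations

def closure(pairs):
    """Close a set of precedence pairs under transitivity."""
    result = set(pairs)
    while True:
        extra = {(a, d) for a, b in result for c, d in result if b == c}
        if extra <= result:
            return result
        result |= extra

def is_partial_order(pairs):
    """Acyclic iff the closure contains no pair (a, a)."""
    return all(a != b for a, b in closure(pairs))

def dl_arcs(p1, p2, variables):
    """Arc (x,y) iff Ly precedes Ux in the first and Lx precedes Uy in the second."""
    c1, c2 = closure(p1), closure(p2)
    return {(x, y) for x in variables for y in variables if x != y
            and (("L", y), ("U", x)) in c1 and (("L", x), ("U", y)) in c2}

def augment(p1, p2, X, Z):
    """Step (2): add Ly > Lx to T_1* and Uy > Ux to T_2* until nothing changes."""
    p1, p2 = set(p1), set(p2)
    changed = True
    while changed:
        changed = False
        c1, c2 = closure(p1), closure(p2)
        for z in Z:
            for x in X:
                for y in X:
                    if x == y:
                        continue  # handled by the DL check of step (3)
                    if (("L", z), ("U", x)) in c2 and (("L", y), ("U", z)) in c1:
                        for target, known, pair in (
                                (p1, c1, (("L", y), ("L", x))),
                                (p2, c2, (("U", y), ("U", x)))):
                            if pair not in known:
                                target.add(pair)
                                changed = True
    return p1, p2

def unsafe(p1, p2, variables):
    """Try every split (X, Z) in place of the nondeterministic guess."""
    for size in range(1, len(variables)):
        for X in combinations(variables, size):
            Z = [v for v in variables if v not in X]
            q1, q2 = augment(p1, p2, X, Z)
            if not (is_partial_order(q1) and is_partial_order(q2)):
                continue
            arcs = dl_arcs(q1, q2, variables)
            if not any((z, x) in arcs for z in Z for x in X):
                return True  # step (4): say yes
    return False

def chain(steps):
    """Precedence pairs of a totally ordered transaction."""
    return [(steps[i], steps[i + 1]) for i in range(len(steps) - 1)]
```

On the pair Lx Ux Ly Uy and Ly Uy Lx Ux the split X={y}, Z={x} passes all checks and the procedure answers yes; on two totally ordered two-phase transactions every split is rejected.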


The nondeterministic choice at step (1) indicates that the decision problem
"Given T={T_1,T_2} is it safe?" may be co-NP-Complete. Such a result would be
interesting since it would illustrate the effect of multiple sites on the complexity of
the problem.


Until now we have discussed transaction systems T with two transactions. The
question of safety of a system with an arbitrary number of centralized transactions is
co-NP-Complete [39], because of a combinatorial condition introduced by the
conflict graph G(T). Since the question of safety of a system of an arbitrary number
of distributed transactions is in co-NP, we cannot hope to indicate a difference
between centralized and distributed by further pursuing this problem.


Another interesting issue is that of deadlock freedom. For the centralized case
the geometric approach used for safety [39] gives us a test of deadlock freedom at no
extra cost. The approach using DL(T_1,T_2) does not have this nice property.


Therefore we have determined three interesting open problems: 


(a) Given a system {T_1,T_2} of arbitrary locked distributed transactions, is it
safe?


(b) Can the polynomial time bounds implied by Theorems 5 and 6 be improved
using the special structure of DL(T_1,T_2)?


(c) Given a system {T_1,T_2} of locked distributed transactions, is it deadlock-free
(even if two sites are used and the system is safe)?


6. Conclusions and Open Problems


We have examined the complexity of distributed database concurrency control. 
We have provided a rigorous mathematical framework for the study of on-line 
distributed problems (Chapter 2), established a connection between distributed 
computation and combinatorial games (Chapter 3), and finally derived negative 
(Chapter 4) and positive (Chapter 5) complexity results. 


Our main result (Theorem 4) shows that concurrency control, an on-line problem 
clearly in NP in the centralized case, is PSPACE-Complete in the distributed case. 
This result is quite strong, in that it holds for transaction systems of rather ordinary 
appearance (e.g., transactions consisting of sequences of six updates each). Also, the 
negative implications of our result (Corollary 4.4) are quite robust. For example, 
even if the scheduler is equipped with a powerful oracle belonging anywhere in the 
polynomial hierarchy, it still cannot minimize communication efficiently unless the 
polynomial hierarchy collapses. 


In the process of proving this negative result, we have related distributed 
concurrency control to certain combinatorial games played on graphs. It could be 
that this connection is of some practical value, since the length of these games 
corresponds to counting messages. There is a more-or-less immediate heuristic for 
approximating an optimal strategy in the game CONFLICT. This heuristic is based 
on the following purely combinatorial problem, which is still open: 


(I) “Given an undirected graph with its edges colored red and 
green, find the smallest set of edges that have to be deleted in 
order for the resulting graph to have no two-color cycle.” 
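
To make the statement of problem (I) concrete, here is a brute-force Python sketch. It is not the heuristic mentioned above and takes exponential time; it only pins down what is being asked on tiny instances. It assumes that a "two-color cycle" means a simple cycle containing at least one red and at least one green edge, which is an interpretation rather than a definition taken from the text. Under that reading, a two-color cycle exists exactly when some biconnected component contains edges of both colors, since any two edges of a biconnected component lie on a common simple cycle. All function names are hypothetical.

```python
from itertools import combinations


def blocks(n, edges):
    """Biconnected components of an undirected graph, as lists of edge indices.

    edges is a list of (u, v, color) on vertices 0..n-1. Recursive DFS with
    an edge stack; intended for small graphs only.
    """
    adj = [[] for _ in range(n)]
    for i, (u, v, _) in enumerate(edges):
        adj[u].append((v, i))
        adj[v].append((u, i))
    disc = [0] * n  # discovery time, 0 means unvisited
    low = [0] * n
    timer = [0]
    stack, out = [], []

    def dfs(u, parent_edge):
        timer[0] += 1
        disc[u] = low[u] = timer[0]
        for v, i in adj[u]:
            if i == parent_edge:
                continue
            if disc[v] == 0:  # tree edge
                stack.append(i)
                dfs(v, i)
                low[u] = min(low[u], low[v])
                if low[v] >= disc[u]:  # u separates the subtree: pop one block
                    comp = []
                    while True:
                        e = stack.pop()
                        comp.append(e)
                        if e == i:
                            break
                    out.append(comp)
            elif disc[v] < disc[u]:  # back edge to an ancestor
                stack.append(i)
                low[u] = min(low[u], disc[v])

    for s in range(n):
        if disc[s] == 0:
            dfs(s, -1)
    return out


def has_two_color_cycle(n, edges):
    """True if some block contains edges of more than one color."""
    return any(len({edges[i][2] for i in comp}) > 1 for comp in blocks(n, edges))


def min_deletion_set(n, edges):
    """Smallest set of edges whose removal leaves no two-color cycle (exhaustive)."""
    for k in range(len(edges) + 1):
        for removed in combinations(range(len(edges)), k):
            kept = [e for i, e in enumerate(edges) if i not in removed]
            if not has_two_color_cycle(n, kept):
                return [edges[i] for i in removed]
```

For example, on two red triangles that share a single green edge, `min_deletion_set` returns just that green edge: deleting it leaves only an all-red cycle.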


Other open problems from Chapter 4 are related to technical issues (II)&(III) or 
to the messages vs. computation steps argument of Corollary 4.4 (IV)&(V). This last 
argument seems quite general in the context of distributed computation. 


(II) Given T without cross-edges and b>0, is the minimax length 
of PREFIX(<T\2>) greater than b? (conjectured to be 
PSPACE-Complete) 


(III) Given T, is the minimax length of PREFIX(<T,@>) greater 
than 0? (conjectured to be in P) 


(IV) What are the consequences of NP=PSPACE on the 
existence of efficient schedulers? 


(V) Can a contradiction similar to Corollary 4.4 be derived if 
[<T,9>∈M(b)?] is NP-Complete? 


In Chapter 5 a new O(n^2) safety test was derived for two-transaction locked 
systems {T1,T2}. This is a necessary and sufficient condition if transactions are 
distributed at two sites, and sufficient otherwise. There are a number of interesting 
open problems: 


(VI) Given {T1,T2} distributed at an arbitrary number of sites, 
are they safe? (conjectured to be co-NP-Complete) 


This would demonstrate the complexity introduced by the number of sites. 


(VII) Given {T1,T2} distributed at two sites and safe, are they 
deadlock-free? 


Issues of local and global deadlocks and local deadlock managers 
recall the analysis of Chapters 3 and 4. 


(VIII) Can the polynomial bounds of O(n^2) (n is the number of 
nodes of the digraph DL) implied by Theorems 5 and 6 be 
improved using the special structure of DL? 


Such an improvement is possible in the centralized case, where an 
O(n log n log log n) bound is known. 


Finally, our analysis of distributed locking can serve as the basis for the 
development of novel distributed locking strategies, which are not simply 
generalizations of centralized rules. 

